Abstract

In this work, we study the performance of random isometric precoders over quasi-static and correlated fading
channels. We derive deterministic approximations of the mutual information and the signal-to-interference-plus-noise
ratio (SINR) at the output of the minimum-mean-square-error (MMSE) receiver and provide simple, provably
converging fixed-point algorithms for their computation. Although these approximations are only proven exact in
the asymptotic regime with infinitely many antennas at the transmitters and receivers, simulations suggest that they
closely match the performance of small-dimensional systems. As examples, we apply our results to the performance
analysis of multi-cellular communication systems, multiple-input multiple-output multiple-access channels
(MIMO-MAC), and MIMO interference channels. The mathematical analysis is based on the Stieltjes transform method,
which enables the derivation of deterministic equivalents of functionals of large-dimensional random matrices. In
contrast to previous works, our analysis does not rely on arguments from free probability theory, which allows us to
consider random matrix models for which asymptotic freeness does not hold. Thus, the results of this work are also a
novel contribution to the field of random matrix theory and are applicable to a wide spectrum of practical systems.

I. Introduction
Consider the following discrete-time wireless channel model

$$\mathbf{y} = \sum_{k=1}^{K} \mathbf{H}_k \mathbf{W}_k \mathbf{P}_k^{1/2} \mathbf{x}_k + \mathbf{n} \tag{1}$$

where 

(i) $\mathbf{y} \in \mathbb{C}^{N}$ is the channel output vector,

(ii) $\mathbf{H}_k \in \mathbb{C}^{N \times N_k}$, $k \in \{1, \ldots, K\}$, are complex channel matrices, satisfying either of the following properties:
(ii-a) The matrix $\mathbf{H}_k$ is deterministic. In this case, we will denote $\mathbf{R}_k = \mathbf{H}_k \mathbf{H}_k^H$.

(ii-b) The matrix $\mathbf{H}_k$ is a random channel matrix whose $j$th column vector $\mathbf{h}_{kj} \in \mathbb{C}^{N}$ is modeled as

$$\mathbf{h}_{kj} = \mathbf{R}_{kj}^{1/2} \mathbf{z}_{kj}, \quad j \in \{1, \ldots, N_k\} \tag{2}$$

where $\mathbf{R}_{kj} \in \mathbb{C}^{N \times N}$ are Hermitian nonnegative definite matrices and the vectors $\mathbf{z}_{kj} \in \mathbb{C}^{N}$ have
independent and identically distributed (i.i.d.) elements with zero mean, variance $1/N$ and $4 + \epsilon$ moment
of order $O(1/N^{2+\epsilon/2})$, for some common $\epsilon > 0$.

(iii) $\mathbf{W}_k \in \mathbb{C}^{N_k \times n_k}$, $k \in \{1, \ldots, K\}$, are complex (signature or precoding) matrices which each contain $n_k \le N_k$
orthonormal columns of independent $N_k \times N_k$ Haar-distributed random unitary matrices,$^1$

(iv) $\mathbf{P}_k \in \mathbb{R}_+^{n_k \times n_k}$, $k \in \{1, \ldots, K\}$, are diagonal (power loading) matrices with nonnegative entries,

(v) $\mathbf{x}_k \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}_{n_k})$, $k \in \{1, \ldots, K\}$, are random independent transmit vectors,

(vi) $\mathbf{n} \sim \mathcal{CN}(\mathbf{0}, \sigma^2 \mathbf{I}_N)$ is a noise vector.

In addition, we define the ratios of the matrix dimensions $c_i = n_i/N_i$ and $\bar{c}_i = N_i/N$ for $i \in \{1, \ldots, K\}$.
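As an illustration of hypotheses (iii)-(vi), one channel use of model (1) can be simulated as in the sketch below. This is a minimal example under stated assumptions, not the authors' code: the helper names `haar_isometry` and `sample_channel_output` are hypothetical, and the Haar matrix is drawn through a phase-corrected QR decomposition of a Gaussian matrix, a standard alternative to the construction $\mathbf{X}_k(\mathbf{X}_k^H\mathbf{X}_k)^{-1/2}$.

```python
import numpy as np

def haar_isometry(N_k, n_k, rng):
    """First n_k columns of an N_k x N_k Haar-distributed unitary matrix."""
    X = (rng.standard_normal((N_k, N_k))
         + 1j * rng.standard_normal((N_k, N_k))) / np.sqrt(2)
    Q, R = np.linalg.qr(X)
    d = np.diag(R)
    Q = Q * (d / np.abs(d))  # phase correction so that Q is Haar distributed
    return Q[:, :n_k]

def sample_channel_output(H_list, W_list, p_list, sigma2, rng):
    """One draw of y in model (1); p_list[k] holds the diagonal of P_k."""
    N = H_list[0].shape[0]
    # noise n ~ CN(0, sigma2 * I_N)
    y = np.sqrt(sigma2 / 2) * (rng.standard_normal(N) + 1j * rng.standard_normal(N))
    for H, W, p in zip(H_list, W_list, p_list):
        n_k = W.shape[1]
        # transmit vector x_k ~ CN(0, I_{n_k})
        x = (rng.standard_normal(n_k) + 1j * rng.standard_normal(n_k)) / np.sqrt(2)
        y = y + H @ (W @ (np.sqrt(p) * x))
    return y

# Hypothetical toy dimensions: K = 2 transmitters, N = 8, N_k = 6, n_k = 3.
rng = np.random.default_rng(0)
N, N_k, n_k = 8, 6, 3
H = [rng.standard_normal((N, N_k)) + 0j for _ in range(2)]
W = [haar_isometry(N_k, n_k, rng) for _ in range(2)]
p = [np.ones(n_k), 0.5 * np.ones(n_k)]
y = sample_channel_output(H, W, p, sigma2=0.1, rng=rng)
```

The isometry property $\mathbf{W}_k^H\mathbf{W}_k = \mathbf{I}_{n_k}$ holds for every draw, which is what distinguishes this model from precoders with i.i.d. entries.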

Remark 1: The statistical model (2) of the channel $\mathbf{H}_k$ under assumption (ii-b) generalizes several well-known
fading channel models of interest (see [1], [2] for examples). These models comprise in particular the Kronecker
channel model with transmit and receive correlation matrices [3], [4], where the matrices $\mathbf{H}_k$ are given by

$$\mathbf{H}_k = \mathbf{R}_k^{1/2} \mathbf{Z}_k \mathbf{T}_k^{1/2} \tag{3}$$

with $\mathbf{Z}_k \in \mathbb{C}^{N \times N_k}$ a random matrix whose elements are independent $\mathcal{CN}(0, 1/N)$ and $\mathbf{R}_k \in \mathbb{C}^{N \times N}$, $\mathbf{T}_k \in \mathbb{C}^{N_k \times N_k}$
antenna correlation matrices. Since both $\mathbf{Z}_k$ and $\mathbf{W}_k$ are unitarily invariant, we can assume without loss
of generality for the statistical properties of $\mathbf{y}$ that $\mathbf{T}_k = \mathrm{diag}(t_{k1}, \ldots, t_{kN_k})$. Defining the matrices $\mathbf{R}_{kj} = t_{kj}\mathbf{R}_k$
for $j \in \{1, \ldots, N_k\}$, we fall back to the channel model in (2). Taking instead all $\mathbf{R}_{kj}$ to be diagonal matrices
makes the entries of $\mathbf{H}_k$ independent, with $[\mathbf{H}_k]_{ij}$ of zero mean and variance $[\mathbf{R}_{kj}]_{ii}/N$. This corresponds to a
centered variance profile model, studied extensively in [5], [6], [7].
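The reduction of the Kronecker model (3) to the column model (2) can be checked numerically. The sketch below is an illustration only (the helper `psd_sqrt` and the toy dimensions are assumptions): it builds one Kronecker channel with diagonal $\mathbf{T}_k$ and rebuilds it column by column with $\mathbf{R}_{kj} = t_{kj}\mathbf{R}_k$.

```python
import numpy as np

def psd_sqrt(A):
    """Hermitian square root of a Hermitian nonnegative definite matrix."""
    w, V = np.linalg.eigh(A)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.conj().T

rng = np.random.default_rng(1)
N, N_k = 5, 4
G = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
R = G @ G.conj().T / N                     # receive correlation R_k
t = rng.uniform(0.5, 2.0, size=N_k)        # diagonal entries t_kj of T_k
Z = (rng.standard_normal((N, N_k))
     + 1j * rng.standard_normal((N, N_k))) / np.sqrt(2 * N)  # entries CN(0, 1/N)

# Kronecker model (3): H_k = R_k^{1/2} Z_k T_k^{1/2}
H_kron = psd_sqrt(R) @ Z @ np.diag(np.sqrt(t))

# Column model (2): h_kj = R_kj^{1/2} z_kj with R_kj = t_kj R_k
H_cols = np.column_stack([psd_sqrt(t[j] * R) @ Z[:, j] for j in range(N_k)])
```

Both constructions give the same matrix for the same $\mathbf{Z}_k$, since $(t_{kj}\mathbf{R}_k)^{1/2} = \sqrt{t_{kj}}\,\mathbf{R}_k^{1/2}$.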

The objective of this work is to study the performance of the communication channel (1) in the large dimensional
regime where $N, N_1, \ldots, N_K, n_1, \ldots, n_K$ are simultaneously large. In the following, we will consider both the
quasi-static channel scenario, which assumes hypotheses (i), (ii-a), (iii)-(vi), and the fading channel scenario, which
assumes (i), (ii-b), (iii)-(vi). The study of the latter naturally arises as an extension of the study of the quasi-static
channel scenario. The respective application contexts and an overview of related works for both scenarios are
summarized below.

$^1$ We recall that a Haar random matrix $\mathbf{W}_k \in \mathbb{C}^{N_k \times N_k}$ is defined by $\mathbf{W}_k = \mathbf{X}_k(\mathbf{X}_k^H\mathbf{X}_k)^{-1/2}$ for $\mathbf{X}_k$ a random matrix with
independent $\mathcal{CN}(0, 1)$ entries.

A. Quasi-static channel scenario (hypothesis (ii-a)) 

Possible applications of the channel model (1) under assumptions (i), (ii-a), (iii)-(vi) arise in the study of direct- 
sequence (DS) or multi-carrier (MC) code-division multiple-access (CDMA) systems with isometric signatures over 
frequency-selective fading channels or space-division multiple-access (SDMA) systems with isometric precoding 
matrices over flat-fading channels. More precisely, for DS-CDMA systems, the matrices $\mathbf{H}_k$ are either Toeplitz or
circulant matrices (if a cyclic prefix is used) constructed from the channel impulse response; for MC-CDMA, the
matrices $\mathbf{H}_k$ are diagonal and represent the channel frequency response on each sub-carrier; for flat fading SDMA
systems, the matrices $\mathbf{H}_k$ can be of arbitrary form and their elements represent the complex channel gains between
the transmit and receive antennas. In all cases, the diagonal entries of the matrices $\mathbf{P}_k$ determine the transmit power
of each signature (CDMA) or transmit stream (SDMA). 

The large system analysis of random i.i.d. and random orthogonal precoded systems with optimal and sub-optimal 
linear receivers has been the subject of numerous publications. The asymptotic performance of
minimum-mean-square-error (MMSE) receivers for the channel model (1) for the case $K = 1$, $\mathbf{P}_1 = \mathbf{I}_{n_1}$, and $\mathbf{H}_1$ diagonal with
i.i.d. elements has been studied in [8] relying on results from free probability theory. This result was extended to
frequency-selective fading channels and sub-optimal receivers in [9]. Although not published, the associated mutual
information was evaluated in [10] (this result is recalled in [11, Theorem 4.11]). The case of i.i.d. and isometric
MC-CDMA over Rayleigh fading channels with multiple signatures per user terminal, i.e., $K > 1$ and $\mathbf{H}_k$ diagonal
with i.i.d. complex Gaussian entries, was considered in [12], where approximate solutions of the SINR at the
output of the MMSE receiver were provided. Asymptotic expressions for the
spectral efficiency of the same model were then derived in [13]. DS-CDMA over flat-fading channels, i.e., $K > 1$,
$n_k = N$, and $\mathbf{H}_k = \mathbf{I}_N$ for all $k$, was studied in [14], where the authors derived deterministic equivalents of the
Shannon- and $\eta$-transform based on the asymptotic freeness [11, Section 3.5] of the matrices $\mathbf{W}_k\mathbf{P}_k\mathbf{W}_k^H$. Besides, a
sum-rate maximizing power-allocation algorithm was proposed. Finally, a different approach via incremental matrix 
expansion [15] led to the exact characterization of the asymptotic SINR of the MMSE receiver for the general 
channel model (1). However, the previously mentioned works share the underlying assumption that the spectral
distributions of the matrices $\mathbf{H}_k$ and $\mathbf{P}_k$ converge to some limiting distributions or that the matrices $\mathbf{H}_k\mathbf{H}_k^H$ are
jointly diagonalizable.$^2$ In addition, the computation of the asymptotic SINR requires the solution of rather
complicated implicit equations. These can be solved in most cases by standard fixed-point algorithms, but a proof of
convergence to the correct solution was not provided. Finally, a closed-form expression for the asymptotic spectral
efficiency is missing, although an approximate solution which requires numerical integration was presented in [13].
Alternative combinatorial methods also exist, such as the diagrammatic approach [16], to evaluate the successive
moments of the limiting eigenvalue distribution of such matrix models.

$^2$ That is, there exists a unitary matrix $\mathbf{V}$ such that $\mathbf{V}\mathbf{H}_k\mathbf{H}_k^H\mathbf{V}^H$ is diagonal for all $k$.

The above results assume non-random communication channels $\mathbf{H}_k$ and can only be applied to the performance
analysis of static or slow fading channels. Turning the matrices $\mathbf{H}_k$ into random matrices instead allows for the
study of the ergodic performance of fast fading channels with isometric precoders. The next section discusses the 
practical applications in this broader context. 

B. Fading channel scenario (hypothesis (ii-b)) 

The second scenario considers the channel model (1) under assumptions (i), (ii-b), (iii)-(vi). In contrast to the 
first scenario, the $\mathbf{H}_k$ matrices are now assumed to be random. Thus, we aim at evaluating both the instantaneous
performance for a random channel realization and the ergodic performance. These are appropriate performance 
measures in fast fading environments. 

Of particular interest in this setting is the evaluation of the multiple-input multiple-output (MIMO) channel 
capacity under random beamforming. In point-to-point MIMO channels, the ergodic channel capacity has been the 
object of numerous works and is by now well understood [17], [18]. However, the ergodic sum-rate of more involved 
models, such as the MIMO multiple access channel (MIMO-MAC) [4] under individual or sum power constraints, 
has been studied only recently within the scope of random matrix theory. Another important aspect is the capacity 
of MIMO channels with co-channel interference, for which much less is known about the optimal transmission 
strategies [19], [20]. The first interesting question relates to the problem of how many antennas should be used 
for transmission and how many independent data streams should be sent, which are the same problem when the 
channels have i.i.d. entries. With transmit antenna correlation, however, it makes a difference which antennas are 
selected for transmission and the question of the optimal number of antennas to be used becomes a combinatorial 
problem. To circumvent this issue, random beamforming can be used. The remaining question is then how many 
orthogonal streams should be sent, using all available antennas. We will address this problem later in this article, as 
our results enable the evaluation of the sum-rate of systems composed of multiple transmitter-receiver pairs, each 
applying random isotropic beamforming. 

In summary, regardless of the specific application scenario of the model (1), unitary precoders have gained 
significant interest in wireless communications [21] (see also the recent work on spatial multiplexing systems [22] 
and limited feedback beamforming solutions in future wireless standards [23]). Thus, the performance evaluation
of isometrically precoded systems is essential and a field of active research [24].

C. Contributions 

The object of this article is to propose a new framework for the analysis of large random matrix models involving
Haar matrices using the Stieltjes-transform method initiated by Pastur and fully exploited by Bai and Silverstein
[25], [26]. This method is considered today as one of the most practical and powerful tools for handling large
random matrices in wireless communications research. Our analysis is fundamentally based on a trace lemma for
Haar matrices first provided in [8] and recalled in Lemma 5 (Appendix F). Unlike previous contributions, we
dismiss most of the practical constraints of free probability theory, combinatorial and incremental matrix expansion
methods, such as the need for spectral limits of the deterministic matrices in the model to exist, or the need for
the matrices $\mathbf{H}_k\mathbf{H}_k^H$ to be diagonalizable in a common eigenvector basis. The expressions we derive appear to be
very similar to previously derived expressions when the precoding matrices have i.i.d. entries instead of being 
Haar distributed (see in particular Remark 2). This allows for a unified understanding of both models with i.i.d. or 
Haar matrices. As a consequence, we believe that the generality of the theoretical results presented in this article, 
supported by a large scope of application contexts, might stimulate further related research. We also mention that 
an alternative method to prove the results of this paper could be based on the integration by parts formula for 
Gaussian random matrices developed by Pastur [27]. 

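Before summarizing our main contributions, we introduce some definitions which will be of repeated use. The
central object of interest is the matrix $\mathbf{B}_N \in \mathbb{C}^{N \times N}$, defined as

$$\mathbf{B}_N = \sum_{k=1}^{K} \mathbf{H}_k \mathbf{W}_k \mathbf{P}_k \mathbf{W}_k^H \mathbf{H}_k^H.$$

We denote by $I_N(\sigma^2)$ the normalized mutual information of the channel (1), given by [28]

$$I_N(\sigma^2) = \frac{1}{N} \log\det\left(\mathbf{I}_N + \frac{1}{\sigma^2}\mathbf{B}_N\right) \quad \text{(nats/s/Hz)}.$$

We further denote by $\gamma_{kj}^{N}(\sigma^2)$ the SINR at the output of the linear MMSE detector for the $j$th component of the
transmit vector $\mathbf{x}_k$, which reads [29]

$$\gamma_{kj}^{N}(\sigma^2) = p_{kj}\, \mathbf{w}_{kj}^H \mathbf{H}_k^H \left(\mathbf{B}_{N(k,j)} + \sigma^2 \mathbf{I}_N\right)^{-1} \mathbf{H}_k \mathbf{w}_{kj}$$

where $\mathbf{B}_{N(k,j)} = \mathbf{B}_N - p_{kj}\mathbf{H}_k\mathbf{w}_{kj}\mathbf{w}_{kj}^H\mathbf{H}_k^H$, $p_{kj}$ is the $j$th diagonal entry of $\mathbf{P}_k$, and $\mathbf{w}_{kj}$ is the $j$th column of $\mathbf{W}_k$. We then define the normalized
sum-rate with MMSE detection as

$$R_N(\sigma^2) = \frac{1}{N} \sum_{k=1}^{K} \sum_{j=1}^{n_k} \log\left(1 + \gamma_{kj}^{N}(\sigma^2)\right).$$

Depending on whether we consider the quasi-static channel scenario (ii-a) or the fading channel scenario (ii-b), we
rename $I_N(\sigma^2)$ by $I_N^{(a)}(\sigma^2)$ and $I_N^{(b)}(\sigma^2)$, the mutual information under hypothesis (ii-a) and (ii-b), respectively.
The same holds for $\gamma_{kj}^{N}(\sigma^2)$ and $R_N(\sigma^2)$.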
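For a given realization of the matrices, the three performance measures can be evaluated directly from their definitions. The following sketch is an illustration only: the function names are hypothetical, `p_list[k]` is assumed to hold the diagonal of $\mathbf{P}_k$, and all quantities are in nats.

```python
import numpy as np

def mutual_information(B, sigma2):
    """I_N = (1/N) log det(I_N + B_N / sigma2)."""
    N = B.shape[0]
    return np.linalg.slogdet(np.eye(N) + B / sigma2)[1] / N

def mmse_sinr(H_list, W_list, p_list, sigma2):
    """Return B_N and the per-stream SINRs gamma_kj from their definition."""
    N = H_list[0].shape[0]
    G = [H @ W for H, W in zip(H_list, W_list)]        # effective channels H_k W_k
    B = sum((Gk * p) @ Gk.conj().T for Gk, p in zip(G, p_list))
    sinr = []
    for Gk, p in zip(G, p_list):
        row = []
        for j in range(Gk.shape[1]):
            g = Gk[:, j]
            B_kj = B - p[j] * np.outer(g, g.conj())    # remove stream (k, j)
            v = np.linalg.solve(B_kj + sigma2 * np.eye(N), g)
            row.append(p[j] * np.real(g.conj() @ v))
        sinr.append(np.array(row))
    return B, sinr

def mmse_sum_rate(sinr, N):
    """R_N = (1/N) sum_k sum_j log(1 + gamma_kj)."""
    return sum(np.sum(np.log1p(s)) for s in sinr) / N

# Toy check with a deterministic isometry: H = I_N, W = first n columns of I_N.
N, n, sigma2 = 6, 3, 0.5
B, sinr = mmse_sinr([np.eye(N)], [np.eye(N)[:, :n]], [np.ones(n)], sigma2)
I_N = mutual_information(B, sigma2)
R_N = mmse_sum_rate(sinr, N)
```

In this toy case the streams are mutually orthogonal, so every SINR equals $1/\sigma^2$ and the mutual information coincides with the MMSE sum-rate, $(n/N)\log(1 + 1/\sigma^2)$.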

The technical contributions of this paper are as follows: we derive deterministic approximations $\bar{I}_N(\sigma^2)$, $\bar{\gamma}_{kj}^{N}(\sigma^2)$,
and $\bar{R}_N(\sigma^2)$ of $I_N(\sigma^2)$, $\gamma_{kj}^{N}(\sigma^2)$, and $R_N(\sigma^2)$, respectively, which are (almost surely) asymptotically tight as the
system dimensions $N, N_i, n_i$ grow large at the same rate (denoted simply $N \to \infty$). These approximations, often
referred to as deterministic equivalents, are easy to compute as they are shown to be the limits of simple (provably
converging) fixed-point algorithms, they are given in closed form and do not require any numerical integration, and
they require only very general conditions on the matrices $\mathbf{H}_k$ and $\mathbf{P}_k$.

We then present several applications of our results to wireless communications. First, we consider a cellular uplink
orthogonal SDMA communication model with inter-cell interference, assuming independent codes in adjacent cells 
and quasi-static channels at all communication pairs. We then study a MIMO multiple access channel (MAC) from 
several multi-antenna transmitters to a multi-antenna receiver under the fading channel scenario (hypothesis (ii-b)). 
The transmitters are unaware of the channel realizations and send an arbitrary number of independent data streams 
using isometric random beamforming vectors. The receiver is assumed to be aware of all instantaneous channel 
realizations and beamforming vectors. Under this setting, we derive an approximation for the achievable sum-rate 
and mutual information. Finally, we address the problem of finding the optimal number of independent streams to be 
transmitted in a two-by-two interference channel. Although the use of deterministic approximations in this context 
requires an exhaustive search over all possible stream-configurations, it is computationally much less expensive 
than Monte Carlo simulations. Extensions to more than two transmit-receive pairs and possible different objective 
functions, e.g., weighted sum-rate or sum-rate with MMSE decoding, are straightforward and not presented. 

For all these applications, numerical simulations show that the deterministic approximations are very tight even 
for small system dimensions. In the interference channel model, these simulations suggest in particular that, at low 
SNR, it is optimal to use all streams while, at high SNR, stream-control, i.e., transmitting fewer than the maximal
number of streams, is beneficial.

Our work also constitutes a novel contribution to the field of random matrix theory as we introduce new proof 
techniques based on the Stieltjes transform method for random isometric matrices. Namely, we provide in Theorem 7
(Appendix A) a deterministic equivalent $\bar{F}_N$ of the eigenvalue distribution $F_N$ of $\mathbf{B}_N$, referred to as the empirical
spectral distribution (e.s.d.). That is, $\bar{F}_N$ is such that, as $N \to \infty$, $F_N - \bar{F}_N \Rightarrow 0$, this convergence being valid
almost surely. Although deterministic equivalents of e.s.d. are by now more or less standard and have been developed
for rather involved random matrix models [5], [4], [1], results for the case of isometric (Haar) matrices are still an 
exception. In particular, most results on Haar matrices are based on the assumption of asymptotic freeness of the 
underlying matrices, a requirement which is rarely met for the matrices in the channel model (1) of interest here. 
The approach taken in this work is therefore novel as it does not rely on free probability theory [30], [31] and we 
do not require any of the matrices in (1) to be asymptotically free. Interestingly, a very recent extension of free 
probability theory, coined free deterministic equivalents [32], has come as a response to the present article in which 
free probability tools are developed to tackle the aforementioned limitations. 

The remainder of this article is structured as follows: in Section II, we introduce the main results of this work, 
the proofs of which are postponed to the appendices. In Section III, the results are applied to the practical wireless 
communication models discussed above. Section IV concludes the article. 

Notations: Boldface lower and upper case symbols represent vectors and matrices, respectively. $\mathbf{I}_N$ is the size-$N$
identity matrix and $\mathrm{diag}(x_1, \ldots, x_n)$ is a diagonal matrix with elements $x_i$. The trace, transpose and Hermitian
transpose operators are denoted by $\mathrm{tr}(\cdot)$, $(\cdot)^T$ and $(\cdot)^H$, respectively. The spectral norm of a matrix $\mathbf{A}$ is denoted
by $\|\mathbf{A}\|$, and, for two matrices $\mathbf{A}$ and $\mathbf{B}$, the notation $\mathbf{A} \succ \mathbf{B}$ means that $\mathbf{A} - \mathbf{B}$ is positive-definite. The notations
$\Rightarrow$ and $\xrightarrow{\text{a.s.}}$ denote weak and almost sure convergence, respectively. We use $\mathcal{CN}(\mathbf{m}, \mathbf{R})$ to denote the circular
symmetric complex Gaussian distribution with mean $\mathbf{m}$ and covariance matrix $\mathbf{R}$. We denote by $\mathbb{R}_+$ the set $[0, \infty)$
and by $\mathbb{C}^+$ the set $\{z \in \mathbb{C}, \mathrm{Im}[z] > 0\}$. Denote by $\mathcal{C}(\mathcal{X}, \mathcal{Y})$ the set of continuous functions from $\mathcal{X} \subset \mathbb{C}$ to $\mathcal{Y} \subset \mathbb{C}$,
by $\mathcal{H}(\mathcal{X}, \mathcal{Y})$ the set of holomorphic functions from $\mathcal{X} \subset \mathbb{C}$ to $\mathcal{Y} \subset \mathbb{C}$, and by $\mathcal{S}(\mathcal{X})$ the class of Stieltjes transforms
of finite measures supported by $\mathcal{X} \subset \mathbb{R}$ (see Definition 1 in Appendix A).

II. Main results 

In this section, we present the main results of the article. All proofs are deferred to the appendices. We 
will distinguish the results for the quasi-static and the fading channel scenarios. Since we will make limiting 
considerations as the system dimensions grow large, some technical assumptions will be necessary: 

A1 The notation $N \to \infty$ denotes the simultaneous growth of $N, N_i, n_i$ for all $i$, in such a way that the ratios
$c_i = n_i/N_i$ and $\bar{c}_i = N_i/N$ satisfy $0 \le \liminf_N c_i \le \limsup_N c_i \le 1$ and $0 < \liminf_N \bar{c}_i \le \limsup_N \bar{c}_i < \infty$.
For all convergence results in this paper (as $N \to \infty$), the matrices $\mathbf{P}_k = \mathbf{P}_k(N) \in \mathbb{R}_+^{n_k \times n_k}$, $\mathbf{H}_k = \mathbf{H}_k(N) \in \mathbb{C}^{N \times N_k}$
(as well as the $\mathbf{R}_{kj} = \mathbf{R}_{kj}(N) \in \mathbb{C}^{N \times N}$ under assumption (ii-b)), and $\mathbf{W}_k = \mathbf{W}_k(N) \in \mathbb{C}^{N_k \times n_k}$ should
be understood as sequences of (random) matrices with growing dimensions. Wherever this is clear from the context,
we drop the dependence on $N$ to simplify the notations.

In order to control the power loading matrices as the system grows large, we need the following assumption:

A2 There exists $P > 0$ such that, for all $k$, $\limsup_N \|\mathbf{P}_k\| < P$.

Under (ii-a), the channel gains will need to remain bounded for all large $N$:

A3-a There exists $R > 0$ such that $\max_k \limsup_N \|\mathbf{R}_k\| < R$, where we recall that $\mathbf{R}_k = \mathbf{H}_k\mathbf{H}_k^H$.

The equivalent constraint under (ii-b) is that the channel correlations remain bounded for all large $N$:

A3-b There exists $R > 0$ such that $\limsup_N \|\mathbf{R}_{kj}\| < R$ for all $j, k$.

For technical reasons, the following condition will sometimes be required:

A4 For all random matrices $\mathbf{H}_k$ within a set of probability one, there exists $M > 0$ such that $\max_k \|\mathbf{H}_k\mathbf{H}_k^H\| < M$
for all large $N$.

Assumption A4 is met in particular in the situation when there exists $m > 0$ such that, for all $k, j, N$, $\mathbf{R}_{kj} \in \mathcal{R}_N$
with $\mathcal{R}_N$ a discrete set of cardinality $|\mathcal{R}_N| < m$ for all $N$ (see the arguments in [4]). For example, this holds true
for the scenario of a common correlation matrix at each receiver, i.e., $\mathbf{R}_{kj} = \mathbf{R}_k$ are equal for all $j$.

A. Fundamental Equations 

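We first introduce the fundamental equations for model (1). These equations provide the core deterministic
quantities that will define the deterministic equivalents for $I_N(\sigma^2)$, $\gamma_{kj}^{N}(\sigma^2)$, and $R_N(\sigma^2)$.

Theorem 1 (Fundamental equations under (ii-a)): Consider the system model (1) under assumptions (i), (ii-a),
(iii)-(vi). Let $\sigma^2 > 0$. Then the following system of implicit equations

$$\bar{a}_k(\sigma^2) = \frac{1}{N} \mathrm{tr}\, \mathbf{P}_k \left( a_k(\sigma^2)\mathbf{P}_k + [\bar{c}_k - a_k(\sigma^2)\bar{a}_k(\sigma^2)]\, \mathbf{I}_{n_k} \right)^{-1}$$

$$a_k(\sigma^2) = \frac{1}{N} \mathrm{tr}\, \mathbf{R}_k \left( \sum_{j=1}^{K} \bar{a}_j(\sigma^2)\mathbf{R}_j + \sigma^2 \mathbf{I}_N \right)^{-1} \tag{4}$$

with $k \in \{1, \ldots, K\}$, admits a unique solution such that, for all $k$, $a_k(\sigma^2), \bar{a}_k(\sigma^2) \ge 0$, and $0 \le a_k(\sigma^2)\bar{a}_k(\sigma^2) < c_k\bar{c}_k$.
Moreover, this solution is obtained explicitly by the following fixed-point algorithm

$$\bar{a}_k(\sigma^2) = \lim_{t \to \infty} \bar{a}_k^{(t)}(\sigma^2), \quad a_k(\sigma^2) = \lim_{t \to \infty} a_k^{(t)}(\sigma^2), \quad \bar{a}_k^{(t)}(\sigma^2) = \lim_{l \to \infty} \bar{a}_k^{(t,l)}(\sigma^2)$$

where, for $k \in \{1, \ldots, K\}$,

$$\bar{a}_k^{(t,l)}(\sigma^2) = \frac{1}{N} \mathrm{tr}\, \mathbf{P}_k \left( a_k^{(t)}(\sigma^2)\mathbf{P}_k + [\bar{c}_k - a_k^{(t)}(\sigma^2)\bar{a}_k^{(t,l-1)}(\sigma^2)]\, \mathbf{I}_{n_k} \right)^{-1}$$

$$a_k^{(t)}(\sigma^2) = \frac{1}{N} \mathrm{tr}\, \mathbf{R}_k \left( \sum_{j=1}^{K} \bar{a}_j^{(t-1)}(\sigma^2)\mathbf{R}_j + \sigma^2 \mathbf{I}_N \right)^{-1}$$

with initial values $\bar{a}_k^{(t,0)}(\sigma^2) = 0$ and $\bar{a}_k^{(0)}(\sigma^2) = 0$.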

Proof: The proof is provided in Appendix A. ■ 
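The nested iteration of Theorem 1 translates almost line by line into code. The following is a minimal sketch under stated assumptions rather than a reference implementation: the function name, the stopping tolerance and the iteration cap are hypothetical choices, and each $\mathbf{P}_k$ is passed through its diagonal so that the trace becomes a plain sum.

```python
import numpy as np

def solve_theorem1(R_list, p_list, N_list, sigma2, tol=1e-12, max_iter=10_000):
    """Fixed-point iteration for (a_k, abar_k) of the quasi-static equations (4).

    R_list[k] : N x N matrix R_k = H_k H_k^H
    p_list[k] : diagonal of P_k (length n_k)
    N_list[k] : N_k, so that cbar_k = N_k / N
    """
    N = R_list[0].shape[0]
    K = len(R_list)
    cbar = np.array(N_list, dtype=float) / N
    a = np.zeros(K)
    abar = np.zeros(K)                       # initial value abar_k^(0) = 0
    for _ in range(max_iter):
        # outer step: a_k^(t) computed from abar^(t-1)
        M = sum(ab * R for ab, R in zip(abar, R_list)) + sigma2 * np.eye(N)
        M_inv = np.linalg.inv(M)
        a_new = np.array([np.real(np.trace(R @ M_inv)) / N for R in R_list])
        # inner loop: abar_k^(t) = lim_l abar_k^(t,l), started from abar_k^(t,0) = 0
        abar_new = np.zeros(K)
        for k in range(K):
            p, x = np.asarray(p_list[k], dtype=float), 0.0
            for _ in range(max_iter):
                x_next = np.sum(p / (a_new[k] * p + cbar[k] - a_new[k] * x)) / N
                done = abs(x_next - x) < tol
                x = x_next
                if done:
                    break
            abar_new[k] = x
        gap = max(np.max(np.abs(a_new - a)), np.max(np.abs(abar_new - abar)))
        a, abar = a_new, abar_new
        if gap < tol:
            break
    return a, abar

# Toy case: K = 1, H_1 = I_N, P_1 = I_n with c_1 = n/N = 1/2 and sigma2 = 1.
a, abar = solve_theorem1([np.eye(8)], [np.ones(4)], [8], 1.0)
```

For this toy case the system (4) can be solved by hand, giving $\bar{a}_1 = c_1\sigma^2/(1 + \sigma^2 - c_1) = 1/3$ and $a_1 = 1/(\bar{a}_1 + \sigma^2) = 3/4$, which the iteration reproduces.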

Remark 2: Assume $\bar{c}_k = 1$ for every $k$ (e.g., when $\mathbf{H}_k$ is a Toeplitz matrix as in the CDMA case). Extending
every $\mathbf{P}_k \in \mathbb{R}_+^{n_k \times n_k}$ into $N \times N$ matrices filled with zeros, we may assume $c_k = 1$ without affecting the final
result. In this scenario, the fundamental equations of model (1) under (ii-a) become

$$\bar{a}_k(\sigma^2) = \frac{1}{N} \mathrm{tr}\, \mathbf{P}_k \left( a_k(\sigma^2)\mathbf{P}_k + [1 - a_k(\sigma^2)\bar{a}_k(\sigma^2)]\, \mathbf{I}_N \right)^{-1}$$

$$a_k(\sigma^2) = \frac{1}{N} \mathrm{tr}\, \mathbf{R}_k \left( \sum_{j=1}^{K} \bar{a}_j(\sigma^2)\mathbf{R}_j + \sigma^2 \mathbf{I}_N \right)^{-1} \tag{5}$$

This can be compared to the scenario where the matrices $\mathbf{W}_k$, instead of being Haar matrices, have i.i.d. entries of
variance $1/N$. The fundamental equations of this model were derived in [4, Corollary 1] and are given as follows:

$$\bar{a}_k(\sigma^2) = \frac{1}{N} \mathrm{tr}\, \mathbf{P}_k \left( a_k(\sigma^2)\mathbf{P}_k + \mathbf{I}_N \right)^{-1}$$

$$a_k(\sigma^2) = \frac{1}{N} \mathrm{tr}\, \mathbf{R}_k \left( \sum_{j=1}^{K} \bar{a}_j(\sigma^2)\mathbf{R}_j + \sigma^2 \mathbf{I}_N \right)^{-1} \tag{6}$$

such that $a_k(\sigma^2)$ is positive for all $k$. The scalars $a_k(\sigma^2)$ and $\bar{a}_k(\sigma^2)$ are also defined as the limits of a classical
fixed-point algorithm. The only difference between the two sets of equations lies in the additional term $-a_k(\sigma^2)\bar{a}_k(\sigma^2)\mathbf{I}_N$
in (5), not present in (6).

We now turn to the fundamental equations in the fading channel context. 



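Theorem 2 (Fundamental equations under (ii-b)): Consider the system model (1) under assumptions (i), (ii-b),
(iii)-(vi). Let $\sigma^2 > 0$. Then, the following system of implicit equations

$$\bar{b}_k(\sigma^2) = \frac{1}{N} \mathrm{tr}\, \mathbf{P}_k \left( b_k(\sigma^2)\mathbf{P}_k + [\bar{c}_k - b_k(\sigma^2)\bar{b}_k(\sigma^2)]\, \mathbf{I}_{n_k} \right)^{-1}$$

$$b_k(\sigma^2) = \frac{1}{N} \sum_{j=1}^{N_k} \frac{\zeta_{kj}(\sigma^2)}{1 + \bar{b}_k(\sigma^2)\zeta_{kj}(\sigma^2)}$$

$$\zeta_{kj}(\sigma^2) = \frac{1}{N} \mathrm{tr}\, \mathbf{R}_{kj} \left( \frac{1}{N} \sum_{k'=1}^{K} \sum_{j'=1}^{N_{k'}} \frac{\bar{b}_{k'}(\sigma^2)\mathbf{R}_{k'j'}}{1 + \bar{b}_{k'}(\sigma^2)\zeta_{k'j'}(\sigma^2)} + \sigma^2 \mathbf{I}_N \right)^{-1}$$

with $k \in \{1, \ldots, K\}$, admits a unique solution satisfying $\zeta_{kj}(\sigma^2), b_k(\sigma^2), \bar{b}_k(\sigma^2) \ge 0$ and $0 \le b_k(\sigma^2)\bar{b}_k(\sigma^2) < c_k\bar{c}_k$
for all $k, j$. Moreover, this solution is given explicitly by the following fixed-point algorithm

$$\bar{b}_k(\sigma^2) = \lim_{t \to \infty} \bar{b}_k^{(t)}(\sigma^2), \quad b_k(\sigma^2) = \lim_{t \to \infty} b_k^{(t)}(\sigma^2), \quad \zeta_{kj}(\sigma^2) = \lim_{t \to \infty} \zeta_{kj}^{(t)}(\sigma^2)$$

where

$$\bar{b}_k^{(t)}(\sigma^2) = \lim_{l \to \infty} \bar{b}_k^{(t,l)}(\sigma^2), \qquad \bar{b}_k^{(t,l)}(\sigma^2) = \frac{1}{N} \mathrm{tr}\, \mathbf{P}_k \left( b_k^{(t)}(\sigma^2)\mathbf{P}_k + [\bar{c}_k - b_k^{(t)}(\sigma^2)\bar{b}_k^{(t,l-1)}(\sigma^2)]\, \mathbf{I}_{n_k} \right)^{-1}$$

$$b_k^{(t)}(\sigma^2) = \frac{1}{N} \sum_{j=1}^{N_k} \frac{\zeta_{kj}^{(t)}(\sigma^2)}{1 + \bar{b}_k^{(t-1)}(\sigma^2)\zeta_{kj}^{(t)}(\sigma^2)}$$

$$\zeta_{kj}^{(t)}(\sigma^2) = \lim_{l \to \infty} \zeta_{kj}^{(t,l)}(\sigma^2), \qquad \zeta_{kj}^{(t,l)}(\sigma^2) = \frac{1}{N} \mathrm{tr}\, \mathbf{R}_{kj} \left( \frac{1}{N} \sum_{k'=1}^{K} \sum_{j'=1}^{N_{k'}} \frac{\bar{b}_{k'}^{(t-1)}(\sigma^2)\mathbf{R}_{k'j'}}{1 + \bar{b}_{k'}^{(t-1)}(\sigma^2)\zeta_{k'j'}^{(t,l-1)}(\sigma^2)} + \sigma^2 \mathbf{I}_N \right)^{-1}$$

with the initial values $\zeta_{kj}^{(t,0)}(\sigma^2) = 1/\sigma^2$, $\bar{b}_k^{(0)}(\sigma^2) = 0$ and $\bar{b}_k^{(t,0)}(\sigma^2) = 0$ for all $k, j$.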

Proof: The proof is provided in Appendix D. ■ 
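A possible implementation of the fixed-point algorithm of Theorem 2 is sketched below. It is a hedged illustration, not the authors' code: the function name, tolerance and iteration caps are assumptions, the matrices $\mathbf{R}_{kj}$ of transmitter $k$ are assumed to be stacked in an array of shape $(N_k, N, N)$, and each $\mathbf{P}_k$ is passed through its diagonal.

```python
import numpy as np

def solve_theorem2(R, p_list, sigma2, tol=1e-12, max_iter=5_000):
    """Fixed-point iteration for (b_k, bbar_k, zeta_kj) in the fading scenario.

    R[k]      : array of shape (N_k, N, N) stacking the matrices R_kj
    p_list[k] : diagonal of P_k (length n_k)
    """
    K = len(R)
    N = R[0].shape[1]
    cbar = np.array([Rk.shape[0] for Rk in R], dtype=float) / N
    b = np.zeros(K)
    bbar = np.zeros(K)                                    # bbar_k^(0) = 0
    zeta = [np.full(Rk.shape[0], 1.0 / sigma2) for Rk in R]
    for _ in range(max_iter):
        # zeta^(t): inner loop with bbar^(t-1) fixed, started from 1/sigma2
        z = [np.full(Rk.shape[0], 1.0 / sigma2) for Rk in R]
        for _ in range(max_iter):
            M = sigma2 * np.eye(N, dtype=complex)
            for k in range(K):
                w = bbar[k] / (1.0 + bbar[k] * z[k])      # weight of each R_kj
                M = M + np.tensordot(w, R[k], axes=1) / N
            M_inv = np.linalg.inv(M)
            z_next = [np.real(np.einsum('jab,ba->j', Rk, M_inv)) / N for Rk in R]
            gap = max(np.max(np.abs(zn - zo)) for zn, zo in zip(z_next, z))
            z = z_next
            if gap < tol:
                break
        b_new = np.array([np.sum(z[k] / (1.0 + bbar[k] * z[k])) / N for k in range(K)])
        # bbar^(t): inner loop started from bbar^(t,0) = 0
        bbar_new = np.zeros(K)
        for k in range(K):
            p, x = np.asarray(p_list[k], dtype=float), 0.0
            for _ in range(max_iter):
                x_next = np.sum(p / (b_new[k] * p + cbar[k] - b_new[k] * x)) / N
                done = abs(x_next - x) < tol
                x = x_next
                if done:
                    break
            bbar_new[k] = x
        gap = max(np.max(np.abs(b_new - b)), np.max(np.abs(bbar_new - bbar)))
        b, bbar, zeta = b_new, bbar_new, z
        if gap < tol:
            break
    return b, bbar, zeta

# Toy case: K = 1, R_1j = I_N, N_1 = N, P_1 = I_n with c_1 = 1/2, sigma2 = 1.
N, n, sigma2 = 6, 3, 1.0
b, bbar, zeta = solve_theorem2([np.stack([np.eye(N)] * N)], [np.ones(n)], sigma2)
gamma = b[0] / (1.0 - b[0] * bbar[0])
```

In this uncorrelated toy case the equations can be reduced by hand: $\zeta = (1 - b\bar{b})/\sigma^2$, and the quantity $\gamma = b/(1 - b\bar{b})$ satisfies $\gamma(\sigma^2 + c_1/(1+\gamma)) = 1$, which offers a simple consistency check of the iteration.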

B. System performance 

The following results are all based on the fundamental equations of Theorem 1 and Theorem 2. 
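Theorem 3 (Mutual information under (ii-a)): Consider the system model (1) under assumptions (i), (ii-a), (iii)-(vi),
and denote, for $\sigma^2 > 0$,

$$I_N^{(a)}(\sigma^2) = \frac{1}{N} \log\det\left(\mathbf{I}_N + \frac{1}{\sigma^2}\mathbf{B}_N\right).$$

Assume A1, A2, and A3-a. Then, as $N \to \infty$,

$$\mathrm{E}\, I_N^{(a)}(\sigma^2) - \bar{I}_N^{(a)}(\sigma^2) \to 0$$
$$I_N^{(a)}(\sigma^2) - \bar{I}_N^{(a)}(\sigma^2) \xrightarrow{\text{a.s.}} 0$$

where

$$\bar{I}_N^{(a)}(\sigma^2) = \frac{1}{N} \log\det\left(\mathbf{I}_N + \frac{1}{\sigma^2}\sum_{k=1}^{K} \bar{a}_k\mathbf{R}_k\right) + \sum_{k=1}^{K} \left[ \frac{1}{N} \log\det\left([\bar{c}_k - a_k\bar{a}_k]\,\mathbf{I}_{n_k} + a_k\mathbf{P}_k\right) + (1 - c_k)\bar{c}_k \log(\bar{c}_k - a_k\bar{a}_k) - \bar{c}_k \log(\bar{c}_k) \right] \tag{7}$$

with $a_k = a_k(\sigma^2)$, $\bar{a}_k = \bar{a}_k(\sigma^2)$, $k \in \{1, \ldots, K\}$, given by Theorem 1.

Proof: The proof is provided in Appendix B. ■

Theorem 4 (Mutual information under (ii-b)): Consider the system model (1) under assumptions (i), (ii-b), (iii)-(vi),
and denote, for $\sigma^2 > 0$,

$$I_N^{(b)}(\sigma^2) = \frac{1}{N} \log\det\left(\mathbf{I}_N + \frac{1}{\sigma^2}\mathbf{B}_N\right).$$

Assume A1, A2, A3-b, and A4. Let $b_k = b_k(\sigma^2)$, $\bar{b}_k = \bar{b}_k(\sigma^2)$ and $\zeta_{kj} = \zeta_{kj}(\sigma^2)$ for all $k, j$ be defined as in
Theorem 2. Then, as $N \to \infty$,

$$\mathrm{E}\, I_N^{(b)}(\sigma^2) - \bar{I}_N^{(b)}(\sigma^2) \to 0$$
$$I_N^{(b)}(\sigma^2) - \bar{I}_N^{(b)}(\sigma^2) \xrightarrow{\text{a.s.}} 0$$

where

$$\bar{I}_N^{(b)}(\sigma^2) = \bar{V}_N(\sigma^2) + \frac{1}{N} \sum_{k=1}^{K} \log\det\left([\bar{c}_k - b_k\bar{b}_k]\,\mathbf{I}_{n_k} + b_k\mathbf{P}_k\right) + \sum_{k=1}^{K} \left[ (1 - c_k)\bar{c}_k \log(\bar{c}_k - b_k\bar{b}_k) - \bar{c}_k \log(\bar{c}_k) \right]$$

$$\bar{V}_N(\sigma^2) = \frac{1}{N} \log\det\left(\mathbf{I}_N + \frac{1}{\sigma^2}\frac{1}{N}\sum_{k=1}^{K}\sum_{j=1}^{N_k} \frac{\bar{b}_k\mathbf{R}_{kj}}{1 + \bar{b}_k\zeta_{kj}}\right) - \sum_{k=1}^{K} b_k\bar{b}_k + \frac{1}{N}\sum_{k=1}^{K}\sum_{j=1}^{N_k} \log\left(1 + \bar{b}_k\zeta_{kj}\right). \tag{8}$$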

Proof: The proof is provided in Appendix E. ■ 
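Once the pairs $(a_k, \bar{a}_k)$ of Theorem 1 are available, the deterministic equivalent (7) of Theorem 3 requires only one log-determinant and a few scalar operations. The sketch below illustrates this under stated assumptions: the function name is hypothetical, each $\mathbf{P}_k$ is passed through its diagonal, and the inputs `a`, `abar` are assumed to have been computed beforehand.

```python
import numpy as np

def mi_equivalent_quasi_static(a, abar, R_list, p_list, N_list, sigma2):
    """Deterministic equivalent (7) of the mutual information, in nats.

    a, abar   : solutions a_k, abar_k of the fundamental equations (Theorem 1)
    R_list[k] : N x N matrix R_k = H_k H_k^H
    p_list[k] : diagonal of P_k (length n_k)
    N_list[k] : N_k
    """
    N = R_list[0].shape[0]
    S = sum(ab * R for ab, R in zip(abar, R_list))
    value = np.linalg.slogdet(np.eye(N) + S / sigma2)[1] / N
    for ak, abk, p, N_k in zip(a, abar, p_list, N_list):
        p = np.asarray(p, dtype=float)
        c = len(p) / N_k              # c_k = n_k / N_k
        cbar = N_k / N                # cbar_k = N_k / N
        d = cbar - ak * abk           # positive because a_k abar_k < c_k cbar_k
        value += np.sum(np.log(d + ak * p)) / N
        value += (1.0 - c) * cbar * np.log(d) - cbar * np.log(cbar)
    return value

# Toy case: K = 1, H_1 = I_N, P_1 = I_n, c_1 = 1/2, sigma2 = 1, for which
# the fundamental equations give a_1 = 3/4 and abar_1 = 1/3.
I_bar = mi_equivalent_quasi_static([0.75], [1.0 / 3.0], [np.eye(8)],
                                   [np.ones(4)], [8], 1.0)
```

In this toy case $\mathbf{B}_N = \mathbf{W}_1\mathbf{W}_1^H$ is a projector of rank $n$, so the exact mutual information is $c_1\log(1 + 1/\sigma^2)$ for every realization, and (7) returns the same value. The second group of terms in Theorem 4 has the same form with $(b_k, \bar{b}_k)$ in place of $(a_k, \bar{a}_k)$.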

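Theorem 5 (SINR of the MMSE detector under (ii-a)): Consider the system model (1) under assumptions (i),
(ii-a), (iii)-(vi) and, for $\sigma^2 > 0$, denote

$$\gamma_{kj}^{N,(a)}(\sigma^2) = p_{kj}\, \mathbf{w}_{kj}^H \mathbf{H}_k^H \left(\mathbf{B}_{N(k,j)} + \sigma^2 \mathbf{I}_N\right)^{-1} \mathbf{H}_k \mathbf{w}_{kj}. \tag{9}$$

Assume A1, A2, and A3-a. Then, as $N \to \infty$,

$$\gamma_{kj}^{N,(a)}(\sigma^2) - \bar{\gamma}_{kj}^{N,(a)}(\sigma^2) \xrightarrow{\text{a.s.}} 0$$

where

$$\bar{\gamma}_{kj}^{N,(a)}(\sigma^2) = \frac{p_{kj}\, a_k}{\bar{c}_k - a_k\bar{a}_k}$$

with $a_k = a_k(\sigma^2)$ and $\bar{a}_k = \bar{a}_k(\sigma^2)$ defined in Theorem 1.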

Proof: The proof is provided in Appendix C. ■ 



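As an (almost immediate) corollary, we have the following result.

Corollary 1: Under the conditions of Theorem 5, denote

$$R_N^{(a)}(\sigma^2) = \frac{1}{N} \sum_{k=1}^{K} \sum_{j=1}^{n_k} \log\left(1 + \gamma_{kj}^{N,(a)}(\sigma^2)\right).$$

Then,

$$R_N^{(a)}(\sigma^2) - \bar{R}_N^{(a)}(\sigma^2) \xrightarrow{\text{a.s.}} 0$$

where

$$\bar{R}_N^{(a)}(\sigma^2) = \frac{1}{N} \sum_{k=1}^{K} \sum_{j=1}^{n_k} \log\left(1 + \bar{\gamma}_{kj}^{N,(a)}(\sigma^2)\right).$$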

Proof: The proof is provided in Appendix C. ■ 
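The SINR equivalent of Theorem 5 and the sum-rate equivalent of Corollary 1 are closed-form functions of $(a_k, \bar{a}_k)$. A minimal sketch follows; the function names are hypothetical, each $\mathbf{P}_k$ is passed through its diagonal, and the scalars `a`, `abar` are assumed to solve the fundamental equations of Theorem 1.

```python
import numpy as np

def sinr_equivalent(a, abar, p_list, N_list, N):
    """gamma_bar_kj = p_kj a_k / (cbar_k - a_k abar_k) for every stream (k, j)."""
    return [np.asarray(p, dtype=float) * ak / (N_k / N - ak * abk)
            for ak, abk, p, N_k in zip(a, abar, p_list, N_list)]

def sum_rate_equivalent(gamma_bar, N):
    """R_bar_N = (1/N) sum_k sum_j log(1 + gamma_bar_kj), in nats."""
    return sum(np.sum(np.log1p(g)) for g in gamma_bar) / N

# Toy case: K = 1, H_1 = I_N (N = N_1 = 8), P_1 = I_4, sigma2 = 1,
# for which the fundamental equations give a_1 = 3/4 and abar_1 = 1/3.
gamma_bar = sinr_equivalent([0.75], [1.0 / 3.0], [np.ones(4)], [8], 8)
R_bar = sum_rate_equivalent(gamma_bar, 8)
```

With orthonormal signatures and $\mathbf{H}_1 = \mathbf{I}_N$, the streams do not interfere, so the exact SINR of each stream is $1/\sigma^2 = 1$; the equivalent gives the same value in this case.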

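Theorem 6 (SINR of the MMSE detector under (ii-b)): Consider the system model (1) under assumptions (i),
(ii-b), (iii)-(vi) and, for $\sigma^2 > 0$, denote

$$\gamma_{kj}^{N,(b)}(\sigma^2) = p_{kj}\, \mathbf{w}_{kj}^H \mathbf{H}_k^H \left(\mathbf{B}_{N(k,j)} + \sigma^2 \mathbf{I}_N\right)^{-1} \mathbf{H}_k \mathbf{w}_{kj}.$$

Assume A1, A2, A3-b, and A4. Then, as $N \to \infty$,

$$\gamma_{kj}^{N,(b)}(\sigma^2) - \bar{\gamma}_{kj}^{N,(b)}(\sigma^2) \xrightarrow{\text{a.s.}} 0$$

where

$$\bar{\gamma}_{kj}^{N,(b)}(\sigma^2) = \frac{p_{kj}\, b_k}{\bar{c}_k - b_k\bar{b}_k}$$

with $b_k = b_k(\sigma^2)$ and $\bar{b}_k = \bar{b}_k(\sigma^2)$, given by Theorem 2.

Proof: The proof is provided in Appendix E. ■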

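Similar to the quasi-static channel scenario, we also have the following corollary.

Corollary 2: Under the conditions of Theorem 6, denote

$$R_N^{(b)}(\sigma^2) = \frac{1}{N} \sum_{k=1}^{K} \sum_{j=1}^{n_k} \log\left(1 + \gamma_{kj}^{N,(b)}(\sigma^2)\right).$$

Then,

$$R_N^{(b)}(\sigma^2) - \bar{R}_N^{(b)}(\sigma^2) \xrightarrow{\text{a.s.}} 0$$

where

$$\bar{R}_N^{(b)}(\sigma^2) = \frac{1}{N} \sum_{k=1}^{K} \sum_{j=1}^{n_k} \log\left(1 + \bar{\gamma}_{kj}^{N,(b)}(\sigma^2)\right).$$

Proof: The proof is provided in Appendix E. ■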






Remark 3: Surprisingly, the fundamental equations of Theorems 1 and 2 cannot be solved with the proposed
fixed-point algorithms for the case $c_k = 1$ when the entries of $\mathbf{P}_k$ are all non-zero (recall that assumption (iii) of
the model imposes $c_k \le 1$). Moreover, the proof of Theorem 7 in the appendix cannot be easily extended to this
case. However, if $\mathbf{P}_k = p_k\mathbf{I}_{N_k}$, for some $p_k > 0$, the random matrix $\mathbf{B}_N$ reduces to

$$\mathbf{B}_N = \sum_{k=1}^{K} p_k\mathbf{H}_k\mathbf{H}_k^H = \sum_{k=1}^{K} p_k\mathbf{R}_k. \tag{10}$$

For the quasi-static channel scenario, $\mathbf{B}_N$ is thus entirely deterministic. A careful inspection of the fixed-point
equations of Theorem 1 reveals that $\bar{a}_k$, with definition extended to $c_k = 1$, has two solutions in the closure of
$[0, c_k\bar{c}_k/a_k)$, i.e., $\bar{a}_k = \bar{c}_k/a_k$ or $\bar{a}_k = p_k$. Simulations suggest that, in this scenario, the fixed-point algorithm proposed
in Theorem 1 may converge to either of the solutions depending on the choice of the system parameters. Note that,
for $\bar{a}_k = p_k$, Theorem 3 reduces to

$$\bar{I}_N^{(a)}(\sigma^2) = \frac{1}{N} \log\det\left(\mathbf{I}_N + \frac{1}{\sigma^2}\sum_{k=1}^{K} p_k\mathbf{R}_k\right)$$

as it should be. As for $\bar{a}_k = \bar{c}_k/a_k$, this cannot lead to a correct solution as $\bar{I}_N^{(a)}(\sigma^2)$ would be independent of $p_k$.
These observations are consistent with the condition $\bar{a}_k < c_k\bar{c}_k/a_k$. Similarly, $\bar{b}_k$ in Theorem 2 has the same two possible
solutions in this scenario. With $\bar{b}_k = p_k$, the asymptotic mutual information reduces to

$$\bar{I}_N^{(b)}(\sigma^2) = \bar{V}_N(\sigma^2)$$

which is the asymptotic mutual information of a channel with a generalized variance profile as provided in
Theorem 10 (Appendix G). Thus our results are consistent for the case $c_k = 1$ and $\mathbf{P}_k = p_k\mathbf{I}_{N_k}$. However, if
the entries of $\mathbf{P}_k$ are not all equal and $c_k = 1$, we cannot easily infer the solutions $\bar{a}_k$, $\bar{b}_k$, and the proposed
fixed-point algorithms may not converge to the correct solutions.

Fig. 1. Three-cell example: The BS in the center cell decodes the n streams from the UT in its own cell while treating the other signals as
interference.

Remark 4: Based on the previous remark, under scenario (ii-b) with $K = 1$, $\mathbf{P}_1 = \mathbf{I}_{n_1}$, $N_1 = n_1 = N$, and
$\mathbf{R}_{1j} = \mathbf{I}_N$ for all $j$, the set of implicit equations in Theorem 2 reduces to:

$$\bar{b}(\sigma^2) = 1, \qquad b(\sigma^2) = \frac{\zeta(\sigma^2)}{1 + \zeta(\sigma^2)}, \qquad \zeta(\sigma^2) = \left(\sigma^2 + \frac{1}{1 + \zeta(\sigma^2)}\right)^{-1}$$

which has a unique solution satisfying $\zeta(\sigma^2) \ge 0$ and that can be given in closed-form:

$$\zeta(\sigma^2) = \frac{-1 + \sqrt{1 + 4/\sigma^2}}{2}.$$

We recognize that $\zeta(\sigma^2)$ is the Stieltjes transform of the Marcenko-Pastur law with scale parameter 1 [11, Equation (3.20)]
evaluated on the negative real axis. This result is consistent with our expectations since $\mathbf{B}_N = \mathbf{Z}_1\mathbf{Z}_1^H$,
where $\mathbf{Z}_1 \in \mathbb{C}^{N \times N}$ has i.i.d. entries with zero mean and variance $1/N$. Moreover, the expression of the normalized
asymptotic mutual information as given in Theorem 4 reduces to

$$\bar{I}_N^{(b)}(\sigma^2) = \bar{V}_N(\sigma^2) = \log\left(1 + \zeta(\sigma^2) + \frac{1}{\sigma^2}\right) - \frac{\zeta(\sigma^2)}{1 + \zeta(\sigma^2)}$$

which is consistent with the asymptotic spectral efficiency of a Rayleigh-fading $N \times N$ MIMO channel [33, Equation
(9)] (see also [11, Section 13.2.2]). Equivalently, the asymptotic SINR of the MMSE detector and the associated
normalized sum-rate can be given as (cf. [33, Proposition VI.1]):

$$\bar{\gamma}_{1j}^{N,(b)}(\sigma^2) = \zeta(\sigma^2), \qquad \bar{R}_N^{(b)}(\sigma^2) = \log\left(1 + \zeta(\sigma^2)\right).$$
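For completeness, the closed form of Remark 4 can be recovered in a few lines from the reduced fixed-point equation $\zeta = (\sigma^2 + 1/(1+\zeta))^{-1}$. The derivation below is a worked illustration based only on the reduced equations of Theorem 2 for this special case; it writes $\zeta$ for $\zeta(\sigma^2)$.

```latex
\begin{align*}
\zeta\Bigl(\sigma^2 + \frac{1}{1+\zeta}\Bigr) = 1
  &\;\Longleftrightarrow\; \sigma^2\zeta(1+\zeta) + \zeta = 1 + \zeta
   \;\Longleftrightarrow\; \sigma^2\zeta^2 + \sigma^2\zeta - 1 = 0, \\
\zeta &= \frac{-\sigma^2 + \sqrt{\sigma^4 + 4\sigma^2}}{2\sigma^2}
       = \frac{-1 + \sqrt{1 + 4/\sigma^2}}{2}
       \quad\text{(the only nonnegative root)}, \\
\bar{\gamma} &= \frac{b}{1 - b\,\bar{b}}
       = \frac{\zeta/(1+\zeta)}{1 - \zeta/(1+\zeta)} = \zeta
       \quad\text{(using } \bar{b} = 1,\ b = \zeta/(1+\zeta)\text{)}, \\
\log\Bigl(1 + \zeta + \frac{1}{\sigma^2}\Bigr) &= \log\bigl((1+\zeta)^2\bigr)
       \quad\text{since } \frac{1}{\sigma^2} = \zeta(1+\zeta).
\end{align*}
```

The last identity shows that the mutual information can equivalently be written as $2\log(1+\zeta) - \zeta/(1+\zeta)$.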

Remark 5: Technically, the results obtained for the quasi-static scenario unfold from a Stieltjes transform
framework very similar to [4], [5]. However, some new tools are introduced which simplify the analysis made in
these papers, such as the method of standard interference functions to prove existence and uniqueness of the derived
deterministic equivalents. As for the results in the fading channel scenario, they unfold from the combination of the
results obtained in the quasi-static scenario and the results obtained in [1] (recalled in Appendix G) for a channel
model similar to (1) but without the presence of the $\mathbf{W}_k$ matrices. The central tool to allow this combination is
the Tonelli (or Fubini) theorem, Lemma 9 in Appendix F, on the product probability space engendering both the
(sequences of growing) $\mathbf{W}_k$ and $\mathbf{H}_k$ matrices.

III. Numerical results 

The results of Section II enable a simple characterization of different performance measures of isometric precoded 
multi-user systems with large dimensional quasi-static or fading channels, some of which were introduced in 
Section I. In the following, we apply these results to three practical examples. 

A. Uplink orthogonal SDMA with inter-cell interference 

In this first example, we apply the theoretical results of Section II under the quasi-static channel scenario 
(hypothesis (ii-a)) to the uplink channel of an orthogonal SDMA scheme with inter-cell interference. We consider 
a three-cell system with one active user terminal (UT) per cell. The UT in cell $k$ is equipped with $N_k$ transmit
antennas. We focus on the central cell, whose base station (BS) is equipped with $N$ antennas, and assume that the
signals received from neighboring cells are treated as noise. This setup is schematically depicted in Figure 1. The
received signal $\mathbf{y}$ at the BS reads



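with $\mathbf{H}_i \in \mathbb{C}^{N \times N_i}$ the channel matrix from UT $i$ to the BS, $\mathbf{x}_i \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}_{n_i})$ the transmit symbol of UT $i$,
$\mathbf{W}_i \in \mathbb{C}^{N_i \times n_i}$ the isometric precoding matrix composed of $n_i$ orthogonal vectors and $0 \le \alpha \le 1$ an inter-cell
interference factor. The vector $\mathbf{z} \in \mathbb{C}^{N}$ combines the inter-cell interference and the thermal noise. The covariance
matrix $\mathbf{Z} \in \mathbb{C}^{N \times N}$ of $\mathbf{z}$ is given as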



We assume an SDMA system with channel matrices g (£ NxNk generated as realizations of a random standard 
Gaussian matrix with entries of zero mean and variance 1/N k . For simplicity, we further assume that each UT uses 
rik = n different transmit signatures to which it assigns equal unit power, i.e., Pfc = I„. Under these assumptions, 
the mutual information In {a 2 ) of the central cell when the interference is treated as noise is given by 



According to [34], the spectral norm of HfeHjJ is almost surely uniformly bounded. For such channel realizations, 
we are therefore in the conditions of Theorem 3. As a consequence, In (a 2 ) — In{<t 2 ) 0, with Jjv defined in 
Theorem 3 (termed iff). An approximation of the SINR at the output of the MMSE receiver for the jth entry 
of x 2 can also be computed directly by Theorem 5. We assume a = 0.25, N — 16, N\ = N 2 = N 3 = 8 and 
define SNR = l/er 2 . We consider a single random realization of the matrices H fe , which is assumed to be static 
and therefore deterministically known. 

Figure 2 depicts ijv(er 2 ) and the deterministic equivalent In{<t 2 ) versus SNR for different values of n e {1, 4, 8}, 
scaled to bits/s/Hz instead of nats. Note that for the case n = 8, the matrix Bjy and, thus, the mutual information are 
deterministic (see Remark 4). We observe a very accurate fit between both results over the full range of SNR and 
n. This validates the deterministic approximation of the mutual information for systems of even small dimensions. 
It appears that, at moderate SNR, i.e., when noise dominates interference, the results suggest that using all available 
data streams (or orthogonal transmit signatures) maximizes the rate of the central cell. On the contrary, at high 
SNR, the achieved mutual information is maximal when fewer than N transmit signatures are used. These results 
corroborate the observations of [20]. Additionally, we can perform optimal stream control by numerical comparison 
of the deterministic equivalents of the achievable rates for each n. Such an optimization is performed in Section 
III-C for the two-user interference channel. 

In Figure 3, we compare the per-receive antenna sum rate Rn(<t 2 ) with single-stream MMSE-detection to the 
associated deterministic equivalent Rn(ct 2 ), for the same system conditions as in Figure 2. The sum rate _Rjv(ct 2 ) 



y = H 2 W 2 Pf x 2 + V5HiWiP?xi + ^H 3 W 3 Pf x 3 + n 



Z = Ezz H = a 



[HiWiPiW^H" + H 3 W 3 P 3 W£H?] +a 2 I N . 
is explicitly given by

$$R_N(\sigma^2) = \frac{1}{N}\sum_{k=1}^{n}\log\left(1+\gamma_k(\sigma^2)\right)$$

with $\gamma_k(\sigma^2)$ the SINR of the $k$th stream of UT 2 at the output of the MMSE receiver, as defined in (9). As for $\bar R_N(\sigma^2)$, from Theorem 5, it reads

$$\bar R_N(\sigma^2) = c_2\bar c_2\log\left(1+\frac{a_2(\sigma^2)}{\bar c_2 - a_2(\sigma^2)\bar a_2(\sigma^2)}\right)$$

with $a_2(\sigma^2)$ and $\bar a_2(\sigma^2)$ defined in Theorem 7. For the case $n = 8$, we have used $\bar a_k = 1$ to compute the deterministic
equivalents (see Remark 4). Similar to the previous observations, the deterministic equivalent provides an accurate
approximation for all values of SNR and $n$, although the precision is slightly lower than for the mutual information
in Figure 2. The same conclusions regarding optimal stream control also hold for the MMSE decoder: stream control
is again of interest when the interference dominates the background noise.
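As a rough illustration of the simulated quantity in this example, the following sketch draws one static channel realization and Haar-distributed isometric precoders for the three-cell setup and evaluates the mutual information of the central cell with interference treated as noise. It is only a sketch under stated assumptions: the function names `haar_isometry` and `mutual_information` are hypothetical, the Haar isometries are obtained from a QR decomposition of a complex Gaussian matrix, $\mathbf{P}_k = \mathbf{I}_n$, and the mutual information is taken as $\frac{1}{N}\log\det(\mathbf{I}_N + \mathbf{Z}^{-1}\mathbf{H}_2\mathbf{W}_2\mathbf{W}_2^H\mathbf{H}_2^H)$.

```python
import numpy as np

def haar_isometry(N_k, n_k, rng):
    """n_k columns of an N_k x N_k Haar unitary matrix (QR of a Gaussian matrix)."""
    G = rng.standard_normal((N_k, N_k)) + 1j * rng.standard_normal((N_k, N_k))
    Q, R = np.linalg.qr(G)
    d = np.diagonal(R)
    return (Q * (d / np.abs(d)))[:, :n_k]  # phase correction of the columns

def mutual_information(H, W, alpha, sigma2):
    """Mutual information (nats per receive antenna) of the central cell (index 1);
    cells 0 and 2 are treated as noise and P_k = I_n is assumed."""
    N = H[1].shape[0]
    gram = [(Hk @ Wk) @ (Hk @ Wk).conj().T for Hk, Wk in zip(H, W)]
    Z = alpha * (gram[0] + gram[2]) + sigma2 * np.eye(N)  # interference-plus-noise covariance
    _, logdet = np.linalg.slogdet(np.eye(N) + np.linalg.solve(Z, gram[1]))
    return logdet / N

rng = np.random.default_rng(0)
N, N_k, alpha = 16, 8, 0.25
# One static channel realization with CN(0, 1/N_k) entries
H = [(rng.standard_normal((N, N_k)) + 1j * rng.standard_normal((N, N_k))) / np.sqrt(2 * N_k)
     for _ in range(3)]

for n in (1, 4, 8):
    for snr_db in (0, 20, 40):
        sigma2 = 10 ** (-snr_db / 10)
        vals = [mutual_information(H, [haar_isometry(N_k, n, rng) for _ in range(3)],
                                   alpha, sigma2) for _ in range(200)]
        print(f"n={n}, SNR={snr_db} dB: {np.mean(vals) / np.log(2):.3f} bits/s/Hz")
```

For $n = N_k$ the precoders are square unitary matrices, so the value does not depend on the draw of $\mathbf{W}_k$, which matches the remark that the mutual information is deterministic in that case.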

B. Multiple access channel 

In this and the following example, we apply the theoretical results of Section II under the fading channel scenario
(hypothesis (ii-b)). We consider a MAC from three transmitters to a single receiver as shown in Figure 4. The channel
from each transmitter to the receiver is modeled by the Kronecker model (see Remark 1) with individual transmit
and receive covariance matrices $\mathbf{T}_k$ and $\mathbf{R}_k$, and we additionally assume a different path loss $\alpha_k > 0$ on each link.
The received signal vector $\mathbf{y}$ for this model reads

$$\mathbf{y} = \sum_{k=1}^{3}\sqrt{\alpha_k}\,\mathbf{R}_k^{1/2}\mathbf{Z}_k\mathbf{T}_k^{1/2}\mathbf{W}_k\mathbf{P}_k^{1/2}\mathbf{x}_k + \mathbf{n}$$

where $\mathbf{x}_k \sim \mathcal{CN}(\mathbf{0},\mathbf{I}_{n_k})$ and $\mathbf{n} \sim \mathcal{CN}(\mathbf{0},\sigma^2\mathbf{I}_N)$. We create the correlation matrices according to a generalization
of Jakes' model with non-isotropic signal transmission, see, e.g., [35], [36], [37], where the elements of $\mathbf{T}_k$ and
$\mathbf{R}_k$ are given as

$$[\mathbf{T}_k]_{ij} = \frac{1}{\theta^{t,k}_{\max}-\theta^{t,k}_{\min}}\int_{\theta^{t,k}_{\min}}^{\theta^{t,k}_{\max}}\exp\left(\mathrm{i}\frac{2\pi}{\lambda}d^{t,k}_{ij}\cos\theta\right)d\theta, \qquad [\mathbf{R}_k]_{ij} = \frac{1}{\theta^{r,k}_{\max}-\theta^{r,k}_{\min}}\int_{\theta^{r,k}_{\min}}^{\theta^{r,k}_{\max}}\exp\left(\mathrm{i}\frac{2\pi}{\lambda}d^{r,k}_{ij}\cos\theta\right)d\theta \qquad (11)$$

where $(\theta^{t,k}_{\min}, \theta^{t,k}_{\max})$ and $(\theta^{r,k}_{\min}, \theta^{r,k}_{\max})$ determine the azimuth angles over which useful signal power for the $k$th
transmitter is radiated or received, $d^{t,k}_{ij}$ and $d^{r,k}_{ij}$ are the distances between the antenna elements $i$ and $j$ at the
$k$th transmitter and receiver, respectively, and $\lambda$ is the signal wavelength. We assume uniform power allocation for
all $k$, i.e., $\mathbf{P}_k \propto \mathbf{I}_{n_k}$, and define SNR $= 1/\sigma^2$. All other parameters are summarized in Table I.

Fig. 3. Sum rate $R_N(\sigma^2)$ at the output of the MMSE decoder for user 2 versus SNR for different numbers of transmit signatures $n$, $N = 16$,
$N_i = 8$, $\mathbf{P}_i = \mathbf{I}_n$, $\alpha = 0.5$. Error bars represent one standard deviation on each side.

Fig. 4. MIMO MAC from three transmitters ($k = 1, 2, 3$) with $N_k$ antennas to a receiver with $N$ antennas. Each transmitter sends $n_k$ streams
with precoding matrix $\mathbf{W}_k$ and power allocation $\mathbf{P}_k$ over the channel $\sqrt{\alpha_k}\mathbf{H}_k$.
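To make the construction of the correlation matrices concrete, the sketch below evaluates a Jakes-type correlation matrix by numerical quadrature. It assumes entries of the form $\frac{1}{\theta_{\max}-\theta_{\min}}\int_{\theta_{\min}}^{\theta_{\max}}\exp(\mathrm{i}\,2\pi s\,(i-j)\cos\theta)\,d\theta$ for a uniform linear array with element spacing $s$ wavelengths; the function name `jakes_correlation`, the midpoint rule and the numerical values are illustrative assumptions, not the procedure used for the figures.

```python
import numpy as np

def jakes_correlation(n_ant, theta_min, theta_max, spacing, n_points=4000):
    """Correlation matrix of a uniform linear array with `spacing` wavelengths
    between adjacent elements, for angles uniform on [theta_min, theta_max].
    The angular integral is approximated with the midpoint rule."""
    step = (theta_max - theta_min) / n_points
    theta = theta_min + step * (np.arange(n_points) + 0.5)        # quadrature midpoints
    idx = np.arange(n_ant)[:, None]
    # Column m is the steering vector at angle theta[m]
    A = np.exp(1j * 2 * np.pi * spacing * idx * np.cos(theta)[None, :])
    # Averaging a a^H over the angles gives entries exp(i 2 pi s (i - j) cos(theta))
    return (A @ A.conj().T) / n_points

# Hypothetical parameters: 10 antennas, angles in [0, pi/2], spacing of 4 wavelengths
T = jakes_correlation(10, 0.0, np.pi / 2, 4.0)
print(np.round(np.abs(T[0, :4]), 3))
```

Building the matrix as an average of outer products keeps it Hermitian and nonnegative definite by construction, so that a matrix square root such as $\mathbf{T}_k^{1/2}$ is well defined.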

Figure 5 compares the normalized mutual information $I_N(\sigma^2)$ and the normalized rate with MMSE decoding
$R_N(\sigma^2)$, averaged over 10,000 different realizations of the matrices $\mathbf{H}_k$ and $\mathbf{W}_k$, against their deterministic
approximations $\bar I_N(\sigma^2)$ and $\bar R_N(\sigma^2)$. Although we have chosen small dimensions for all matrices (see Table I), the
match between both results is almost perfect. The fluctuations of $I_N(\sigma^2)$ and $R_N(\sigma^2)$ are also rather small, as can
be seen from the error bars representing one standard deviation in each direction.

TABLE I
Simulation parameters for Figure 5: $N = 10$, $d^{r,k}_{ij} = 8\lambda(i - j)$

$k$   $N_k$   $n_k$   $\theta^{t,k}_{\min}$   $\theta^{t,k}_{\max}$   $\theta^{r,k}_{\min}$   $\theta^{r,k}_{\max}$   $d^{t,k}_{ij}$      $\alpha_k$
1     10      8       $0$                     $\pi/2$                 $-\pi/4$                $0$                     $4\lambda(i-j)$     1
2     5       4       $-\pi/4$                $\pi/4$                 $0$                     $\pi/3$                 $4\lambda(i-j)$     1/2
3     5       4       $-\pi/2$                $0$                     $-\pi/3$                $\pi/3$                 $4\lambda(i-j)$     1/2

Fig. 5. Comparison of the average normalized mutual information $I_N(\sigma^2)$ and the normalized rate with MMSE decoding $R_N(\sigma^2)$ with their
deterministic approximations $\bar I_N(\sigma^2)$ and $\bar R_N(\sigma^2)$, versus SNR [dB]. Error bars represent one standard deviation in each direction.

C. Stream-control in interference channels 

Our last example considers a MIMO interference channel consisting of two transmitter-receiver pairs as depicted
in Figure 6. The received signal vectors $\mathbf{y}_1, \mathbf{y}_2 \in \mathbb{C}^N$ are respectively given as

$$\mathbf{y}_1 = \mathbf{H}_{11}\mathbf{W}_1\mathbf{P}_1^{1/2}\mathbf{x}_1 + \mathbf{H}_{12}\mathbf{W}_2\mathbf{P}_2^{1/2}\mathbf{x}_2 + \mathbf{n}_1$$
$$\mathbf{y}_2 = \mathbf{H}_{21}\mathbf{W}_1\mathbf{P}_1^{1/2}\mathbf{x}_1 + \mathbf{H}_{22}\mathbf{W}_2\mathbf{P}_2^{1/2}\mathbf{x}_2 + \mathbf{n}_2$$

where $\mathbf{H}_{qk} \in \mathbb{C}^{N\times N_k}$, $\mathbf{W}_k \in \mathbb{C}^{N_k\times N_k}$, $\mathbf{x}_k \sim \mathcal{CN}(\mathbf{0},\mathbf{I}_{N_k})$, $\mathbf{P}_k \in \mathbb{R}_+^{N_k\times N_k}$ satisfying $\frac{1}{N_k}\mathrm{tr}\,\mathbf{P}_k = 1$, and $\mathbf{n}_k \sim \mathcal{CN}(\mathbf{0},\sigma^2\mathbf{I}_N)$, for $q, k \in \{1, 2\}$. Assuming that the receivers are aware of both precoding matrices and their respective
channels but treat the interfering transmission as noise, the normalized mutual informations between $\mathbf{x}_1$ and $\mathbf{y}_1$,
and $\mathbf{x}_2$ and $\mathbf{y}_2$, are respectively given as

$$I_1(\sigma^2) = \frac{1}{N}\log\det\left(\mathbf{I}_N + \frac{1}{\sigma^2}\sum_{k=1}^{2}\mathbf{H}_{1k}\mathbf{W}_k\mathbf{P}_k\mathbf{W}_k^H\mathbf{H}_{1k}^H\right) - \frac{1}{N}\log\det\left(\mathbf{I}_N + \frac{1}{\sigma^2}\mathbf{H}_{12}\mathbf{W}_2\mathbf{P}_2\mathbf{W}_2^H\mathbf{H}_{12}^H\right)$$
$$I_2(\sigma^2) = \frac{1}{N}\log\det\left(\mathbf{I}_N + \frac{1}{\sigma^2}\sum_{k=1}^{2}\mathbf{H}_{2k}\mathbf{W}_k\mathbf{P}_k\mathbf{W}_k^H\mathbf{H}_{2k}^H\right) - \frac{1}{N}\log\det\left(\mathbf{I}_N + \frac{1}{\sigma^2}\mathbf{H}_{21}\mathbf{W}_1\mathbf{P}_1\mathbf{W}_1^H\mathbf{H}_{21}^H\right).$$

We adopt the same channel model as in Section III-B, where the channel matrices $\mathbf{H}_{qk}$ are given as

$$\mathbf{H}_{qk} = \mathbf{R}_{qk}^{1/2}\mathbf{Z}_{qk}\mathbf{T}_k^{1/2}$$

where $\mathbf{Z}_{qk} \in \mathbb{C}^{N\times N_k}$ have independent $\mathcal{CN}(0, 1/N)$ entries and $\mathbf{T}_k$ and $\mathbf{R}_{qk}$ are calculated according to (11). We
assume that no channel state information is available at the transmitters, so that the matrices $\mathbf{P}_k$ are simply used
to determine the number of independently transmitted streams:

$$\mathbf{P}_k = \frac{N_k}{n_k}\,\mathrm{diag}(\underbrace{1,\ldots,1}_{n_k},\underbrace{0,\ldots,0}_{N_k-n_k}).$$

We will now apply the previously derived results to find the optimal number of streams $(n_1^*, n_2^*)$ maximizing the
normalized ergodic sum-rate of the interference channel above. That is, we seek to find

$$(n_1^*, n_2^*) = \arg\max_{n_1,n_2}\ \mathrm{E}\left[I_1(\sigma^2) + I_2(\sigma^2)\right] \quad \text{s.t. } 1 \le n_1 \le N_1,\ 1 \le n_2 \le N_2$$

where the expectation is with respect to both channel and precoding matrices. Due to the complexity of the random
matrix model, this optimization problem appears intractable by exact analysis. At the same time, any solution based
on an exhaustive search in combination with Monte Carlo simulations quickly becomes prohibitive for large $N_1, N_2$,
since $N_1 \times N_2$ possible combinations need to be tested. Relying on Theorem 4, we can calculate an approximation
of $\mathrm{E}[I_1(\sigma^2) + I_2(\sigma^2)]$ to find an approximate solution which becomes asymptotically exact as $N_1$ and $N_2$ grow
large. Thus, we determine $(\bar n_1^*, \bar n_2^*)$ as the solution to

$$(\bar n_1^*, \bar n_2^*) = \arg\max_{n_1,n_2}\ \bar I_1(\sigma^2) + \bar I_2(\sigma^2) \quad \text{s.t. } 1 \le n_1 \le N_1,\ 1 \le n_2 \le N_2$$

where $\bar I_1(\sigma^2), \bar I_2(\sigma^2)$ are calculated based on a direct application of Theorem 4 to each of the two log-det terms
in $I_1(\sigma^2)$ and $I_2(\sigma^2)$, respectively. The optimal values $(\bar n_1^*, \bar n_2^*)$ are then found by an exhaustive search over all
possible combinations. Although we still need to compute $N_1 \times N_2$ values, this is computationally much cheaper
than Monte Carlo simulations. Although Theorem 4 does not hold for the case $n_k = N_k$ in general, we can compute
a deterministic equivalent of the mutual information by letting $\bar b_k = 1$ since $\mathbf{W}_k\mathbf{W}_k^H = \mathbf{I}_{N_k}$ (see Remark 4). In this
case, the matrices $\mathbf{W}_k$ vanish and $\bar I_k(\sigma^2)$ reduces to the deterministic equivalent of the mutual information of a
channel with a variance profile as given by Theorem 10 in Appendix G.

Fig. 6. Interference channel from two transmitters with $N_k$ ($k = 1, 2$) antennas, respectively, to two receivers with $N$ antennas each. Each
transmitter sends $n_k$ independent data streams to its respective receiver.
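The exhaustive search over $(n_1, n_2)$ can be sketched as follows. This is a toy illustration under assumptions that are not taken from the text: the ergodic sum-rate is estimated by Monte Carlo averaging rather than by Theorem 4, the channels are uncorrelated ($\mathbf{R}_{qk} = \mathbf{I}_N$, $\mathbf{T}_k = \mathbf{I}_{N_k}$), and all function names are hypothetical. Replacing the Monte Carlo average by a deterministic equivalent would leave the search loop unchanged.

```python
import numpy as np
from itertools import product

def haar_unitary(n, rng):
    """Haar-distributed n x n unitary matrix via QR with phase correction."""
    G = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    Q, R = np.linalg.qr(G)
    d = np.diagonal(R)
    return Q * (d / np.abs(d))

def power_matrix(N_k, n_k):
    """P_k = (N_k / n_k) diag(1,...,1,0,...,0) with n_k ones, so tr(P_k) / N_k = 1."""
    return (N_k / n_k) * np.diag(np.r_[np.ones(n_k), np.zeros(N_k - n_k)])

def logdet(M):
    return np.linalg.slogdet(M)[1]

def sum_rate(H, W, P, sigma2):
    """I_1 + I_2 for one realization; H[q][k] links transmitter k to receiver q."""
    N = H[0][0].shape[0]
    total = 0.0
    for q in range(2):
        S = [H[q][k] @ W[k] @ P[k] @ W[k].conj().T @ H[q][k].conj().T for k in range(2)]
        total += (logdet(np.eye(N) + (S[0] + S[1]) / sigma2)
                  - logdet(np.eye(N) + S[1 - q] / sigma2)) / N   # S[1 - q] is the interferer
    return total

def best_streams(N, N_tx, sigma2, trials, rng):
    """Exhaustive search over (n_1, n_2) using the same random draws for every pair."""
    def cn(r, c):
        return (rng.standard_normal((r, c)) + 1j * rng.standard_normal((r, c))) / np.sqrt(2 * N)
    draws = [([[cn(N, N_tx[k]) for k in range(2)] for _ in range(2)],
              [haar_unitary(N_tx[k], rng) for k in range(2)]) for _ in range(trials)]
    rates = {}
    for n1, n2 in product(range(1, N_tx[0] + 1), range(1, N_tx[1] + 1)):
        P = [power_matrix(N_tx[0], n1), power_matrix(N_tx[1], n2)]
        rates[(n1, n2)] = np.mean([sum_rate(H, W, P, sigma2) for H, W in draws])
    return max(rates, key=rates.get), rates

rng = np.random.default_rng(0)
best, rates = best_streams(N=4, N_tx=(4, 4), sigma2=1.0, trials=20, rng=rng)
print("estimated best (n_1, n_2):", best)
```

Reusing the same channel and precoder draws for all candidate pairs reduces the Monte Carlo noise in the comparison between pairs, at the cost of correlated estimates.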

Figure 7 and Figure 8 show the average normalized sum-rate $\mathrm{E}[I_1(\sigma^2) + I_2(\sigma^2)]$ and the deterministic approximation
$\bar I_1(\sigma^2) + \bar I_2(\sigma^2)$, by Theorem 4, as a function of $(n_1, n_2)$ for the simulation parameters as given in Table II.
We have assumed SNR = 0 dB and SNR = 40 dB in Figure 7 and Figure 8, respectively. In both figures, the
solid grid represents simulation results and the markers the deterministic approximations. We observe here again
an almost perfect overlap between both sets of results for all values of $(n_1, n_2)$. The optimal values $(n_1^*, n_2^*)$ and
$(\bar n_1^*, \bar n_2^*)$ coincide for both values of SNR and are indicated by large crosses. At low SNR, both transmitters should
send as many independent streams as transmit antennas, i.e., $n_1 = n_2 = 10$. At high SNR, one transmitter should
use only a single stream ($n_2 = 1$) and the other transmitter $n_1 = N - 1 = 9$ streams. These results are in line with
the observations of [20].

The solution of the last optimization problem is evidently highly unfair, and better solutions can be achieved by using different
objective functions, such as weighted sum-rate maximization. Optimal stream control with MMSE decoding
could also be carried out in a similar manner. Although we would still need to perform an exhaustive search over
all possible combinations of $n_1, n_2$, the computations based on deterministic equivalents are significantly faster
than simulation-based approaches. The development of more intelligent algorithms to determine $(n_1^*, n_2^*)$ is outside
the scope of this paper and left to future work. The extension to more than two transmitter-receiver pairs is
straightforward.

TABLE II
Simulation parameters for Figures 7 and 8: $N = 10$, $d^{t,k}_{ij} = 4\lambda(i - j)$, $d^{r,q,k}_{ij} = 4\lambda(i - j)$

$(q,k)$   $N_k$   $\theta^{t,k}_{\min}$   $\theta^{t,k}_{\max}$   $\theta^{r,q,k}_{\min}$   $\theta^{r,q,k}_{\max}$
(1,1)     10      $0$                     $\pi/2$                 $-\pi/4$                  $0$
(1,2)     10      $-\pi/2$                $0$                     $0$                       $\pi/4$
(2,1)     10      $0$                     $\pi/2$                 $-\pi/3$                  $0$
(2,2)     10      $-\pi/2$                $0$                     $0$                       $\pi/3$

Fig. 7. Sum-rate versus number of transmitted data-streams $(n_1, n_2)$ for SNR = 0 dB and all other parameters as provided in Table II. Solid
lines correspond to simulation results, markers to the deterministic approximation by Theorem 4. As expected, both transmitters should send
the maximum number of independent streams.

Fig. 8. Sum-rate versus number of transmitted data-streams $(n_1, n_2)$ for SNR = 40 dB and all other parameters as provided in Table II. Solid
lines correspond to simulation results, markers to the deterministic approximation by Theorem 4. As co-channel interference is dominant, there
is a clear gain from limiting the number of transmitted streams.

IV. Conclusions

In this article, we have studied a class of wireless communication channels with random unitary signature or
precoding matrices over quasi-static and fast fading channels, assuming either single or multiple users and cells.
For this wide range of system models, we have provided deterministic approximations of the mutual information,
the SINR at the output of the MMSE receiver, and the associated sum-rate. These approximations were shown
to be asymptotically accurate as the system dimensions grow large, and to be based on fixed-point solutions of a
set of fundamental equations. Practical applications of these results were then proposed in the contexts of multi-cell
SDMA with unitary precoders under multi-cell interference, MIMO-MAC with random unitary precoding, and
interference channels with random beamforming. Simulations of the system performance demonstrate the accuracy
of the approximations even for systems of small dimensions. Moreover, the deterministic equivalent framework was
used to derive the sum-rate maximizing number of streams to transmit in interference channels, a problem which is
intractable by exact analysis. Lastly, we have proposed a novel technical method for the analysis of matrix models
featuring random isometric matrices which goes beyond the current reach of classical free probability approaches.
However, the proof for the case $\limsup c_k = 1$ with arbitrary power allocation for different streams, i.e., when the precoding
matrices $\mathbf{W}_k$ are square and $\mathbf{P}_k \neq \mathbf{I}_{N_k}$, remains an open problem which might be solved with different methods.
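Appendix A

Spectral approximation of $\mathbf{B}_N$ in the quasi-static model

This section is dedicated to the proof of Theorem 7 as given below. This theorem is the cornerstone result for
all other results derived in this article. The proof is based on the Stieltjes transform method, which is extensively
documented in [26], [11].

We first recall some elementary notions which are needed in the following. For a Hermitian matrix $\mathbf{A} \in \mathbb{C}^{N\times N}$
with eigenvalues $\lambda_1 \le \ldots \le \lambda_N$, we denote by $F^{\mathbf{A}}$ the empirical spectral distribution (e.s.d.), defined as

$$F^{\mathbf{A}}(x) = \frac{1}{N}\sum_{i=1}^{N}1_{\{\lambda_i \le x\}}(x).$$

We now recall the definition of a Stieltjes transform.

Definition 1: Let $F$ be the distribution function of a probability measure with support $S$. Then, the Stieltjes
transform of $F$, denoted $m_F$, is the function

$$m_F : \mathbb{C}\setminus S \to \mathbb{C}, \qquad z \mapsto \int\frac{1}{\lambda - z}\,dF(\lambda).$$

In particular, for $F^{\mathbf{A}}$ the e.s.d. of a Hermitian matrix $\mathbf{A} \in \mathbb{C}^{N\times N}$,

$$m_{F^{\mathbf{A}}}(z) = \frac{1}{N}\mathrm{tr}\,(\mathbf{A} - z\mathbf{I}_N)^{-1}$$

which will often be denoted $m_{\mathbf{A}}$.

In the course of the derivations, some defining properties of the Stieltjes transform will be needed. These are
provided in Lemma 1 (Appendix F).

Theorem 7: For $i \in \{1, \ldots, K\}$, let $\mathbf{P}_i \in \mathbb{C}^{n_i\times n_i}$ be a Hermitian nonnegative matrix with spectral norm bounded
uniformly along $n_i$ and $\mathbf{W}_i \in \mathbb{C}^{N_i\times n_i}$ be $n_i < N_i$ columns of a unitary Haar distributed random matrix. Consider
$\mathbf{H}_i \in \mathbb{C}^{N\times N_i}$ a random matrix such that $\mathbf{R}_i = \mathbf{H}_i\mathbf{H}_i^H \in \mathbb{C}^{N\times N}$ has uniformly bounded spectral norm along $N$,
almost surely. Define $c_i = n_i/N_i$, $\bar c_i = N_i/N$, and denote

$$\mathbf{B}_N = \sum_{i=1}^{K}\mathbf{H}_i\mathbf{W}_i\mathbf{P}_i\mathbf{W}_i^H\mathbf{H}_i^H$$

and $F_N$ the e.s.d. of $\mathbf{B}_N$. Then, as $N \to \infty$, with $\bar c_i$ and $c_i$ satisfying $0 < \liminf \bar c_i \le \limsup \bar c_i < \infty$ and
$0 \le \liminf c_i \le \limsup c_i < 1$ for all $i$, the following limit holds true almost surely

$$F_N - \bar F_N \Rightarrow 0$$

where $\bar F_N$ is the distribution function with support on $\mathbb{R}_+$ and Stieltjes transform $\bar m_N(z)$, $z \in \mathbb{C}\setminus\mathbb{R}_+$. The latter
is given by

$$\bar m_N(z) = \frac{1}{N}\mathrm{tr}\left(\sum_{i=1}^{K}\bar e_i(z)\mathbf{R}_i - z\mathbf{I}_N\right)^{-1} \qquad (12)$$

where $(z \mapsto \bar e_1(z), \ldots, z \mapsto \bar e_K(z)) \in \mathcal{H}(\mathbb{C}\setminus\mathbb{R}_+, \mathbb{C})^K$ are defined, for $z \in D$, as the unique solution of the
following system of equations

$$\bar e_i(z) = \frac{1}{N}\mathrm{tr}\,\mathbf{P}_i\left(e_i(z)\mathbf{P}_i + [\bar c_i - e_i(z)\bar e_i(z)]\mathbf{I}_{n_i}\right)^{-1}$$
$$e_i(z) = \frac{1}{N}\mathrm{tr}\,\mathbf{R}_i\left(\sum_{j=1}^{K}\bar e_j(z)\mathbf{R}_j - z\mathbf{I}_N\right)^{-1} \qquad (13)$$

such that, for $z < 0$, $0 \le \bar e_i(z) < c_i\bar c_i/e_i(z)$, for all $i$, explicitly given by:

$$\bar e_i(z) = \lim_{t\to\infty}\bar e_i^{(t)}(z), \qquad e_i(z) = \lim_{t\to\infty}e_i^{(t)}(z), \qquad \bar e_i^{(t)}(z) = \lim_{k\to\infty}\bar e_i^{(t,k)}(z)$$

where, for $t \ge 1$ and $k \ge 1$,

$$e_i^{(t)}(z) = \frac{1}{N}\mathrm{tr}\,\mathbf{R}_i\left(\sum_{j=1}^{K}\bar e_j^{(t-1)}(z)\mathbf{R}_j - z\mathbf{I}_N\right)^{-1}$$
$$\bar e_i^{(t,k)}(z) = \frac{1}{N}\mathrm{tr}\,\mathbf{P}_i\left(e_i^{(t)}(z)\mathbf{P}_i + [\bar c_i - e_i^{(t)}(z)\bar e_i^{(t,k-1)}(z)]\mathbf{I}_{n_i}\right)^{-1}$$

with the initial values $\bar e_i^{(t,0)}(z) = 0$ and $\bar e_i^{(0)}(z) = 0$ for all $i$. Moreover, $(z \mapsto \bar e_1(z), \ldots, z \mapsto \bar e_K(z)) \in \mathcal{S}(\mathbb{R}_+)^K$.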
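The nested iteration in the statement of Theorem 7 translates almost directly into code. The sketch below is one possible reading of it for real $z < 0$, assuming the system $\bar e_i = \frac{1}{N}\mathrm{tr}\,\mathbf{P}_i(e_i\mathbf{P}_i + [\bar c_i - e_i\bar e_i]\mathbf{I}_{n_i})^{-1}$, $e_i = \frac{1}{N}\mathrm{tr}\,\mathbf{R}_i(\sum_j\bar e_j\mathbf{R}_j - z\mathbf{I}_N)^{-1}$ with zero initial values; the function names, stopping rule and example dimensions are illustrative assumptions.

```python
import numpy as np

def deterministic_equivalent(R, p, cbar, z, tol=1e-12, max_iter=1000):
    """Nested fixed-point iteration for (e_i, ebar_i) at a real z < 0.
    R:    list of K Hermitian nonnegative (N, N) arrays, R_i = H_i H_i^H
    p:    list of K 1-D arrays with the n_i eigenvalues of P_i
    cbar: list of K ratios N_i / N
    Returns e, ebar and mbar = (1/N) tr (sum_i ebar_i R_i - z I)^{-1}."""
    K, N = len(R), R[0].shape[0]
    I = np.eye(N)
    e, ebar = np.zeros(K), np.zeros(K)            # outer initial values ebar_i^(0) = 0
    for _ in range(max_iter):
        Q = np.linalg.inv(sum(ebar[j] * R[j] for j in range(K)) - z * I)
        e_new = np.array([np.trace(R[i] @ Q).real / N for i in range(K)])
        ebar_new = np.empty(K)
        for i in range(K):
            x = 0.0                               # inner initial value ebar_i^(t,0) = 0
            for _ in range(max_iter):
                x_next = np.sum(p[i] / (e_new[i] * p[i] + cbar[i] - e_new[i] * x)) / N
                converged = abs(x_next - x) < tol / 10
                x = x_next
                if converged:
                    break
            ebar_new[i] = x
        delta = max(np.max(np.abs(e_new - e)), np.max(np.abs(ebar_new - ebar)))
        e, ebar = e_new, ebar_new
        if delta < tol:
            break
    Q = np.linalg.inv(sum(ebar[j] * R[j] for j in range(K)) - z * I)
    return e, ebar, np.trace(Q).real / N

def haar_isometry(N_k, n_k, rng):
    G = rng.standard_normal((N_k, N_k)) + 1j * rng.standard_normal((N_k, N_k))
    Q, Rq = np.linalg.qr(G)
    d = np.diagonal(Rq)
    return (Q * (d / np.abs(d)))[:, :n_k]

# Comparison with the empirical Stieltjes transform of one random B_N (illustrative sizes)
rng = np.random.default_rng(0)
N, N_i, n_i, z = 64, 32, 16, -1.0
H = [(rng.standard_normal((N, N_i)) + 1j * rng.standard_normal((N, N_i))) / np.sqrt(2 * N_i)
     for _ in range(2)]
W = [haar_isometry(N_i, n_i, rng) for _ in range(2)]
p = [rng.uniform(0.5, 1.5, n_i) for _ in range(2)]
B = sum(H[i] @ W[i] @ np.diag(p[i]) @ W[i].conj().T @ H[i].conj().T for i in range(2))
m_emp = np.trace(np.linalg.inv(B - z * np.eye(N))).real / N
e, ebar, m_det = deterministic_equivalent([Hi @ Hi.conj().T for Hi in H], p, [N_i / N] * 2, z)
print("empirical:", m_emp, "deterministic equivalent:", m_det)
```

A simple sanity check is the case $K = 1$, $\mathbf{H}_1 = \mathbf{I}_N$, $\mathbf{P}_1 = \mathbf{I}_n$: then $\mathbf{B}_N = \mathbf{W}_1\mathbf{W}_1^H$ is a projector with Stieltjes transform $\frac{c}{1-z} - \frac{1-c}{z}$, and the system above has the closed-form solution $\bar e_1 = -cz/(1 - c - z)$.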

Remark 6: Denoting $a_i(\sigma^2) = e_i(-\sigma^2)$ for $\sigma^2 > 0$, we see immediately that Theorem 7 encompasses Theorem 1
as a special case.

We first provide an outline of the proof for better understanding. The full proof is given in Appendix A-B.

A. Sketch of the proof 

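As a first step, we wish to prove that there exists a matrix $\mathbf{F}$ of the form $\mathbf{F} = \sum_{i=1}^{K}\bar f_i\mathbf{R}_i$, with $\bar f_i \in \mathbb{C}$, such
that, for all nonnegative $\mathbf{A}$ with $\|\mathbf{A}\| < \infty$ uniformly on $N$ and $z < 0$,

$$\frac{1}{N}\mathrm{tr}\,\mathbf{A}(\mathbf{B}_N - z\mathbf{I}_N)^{-1} - \frac{1}{N}\mathrm{tr}\,\mathbf{A}(\mathbf{F} - z\mathbf{I}_N)^{-1} \xrightarrow{\mathrm{a.s.}} 0.$$

Taking $\mathbf{A} = \mathbf{R}_i$ and denoting $f_i = \frac{1}{N}\mathrm{tr}\,\mathbf{R}_i(\mathbf{B}_N - z\mathbf{I}_N)^{-1}$, we will have in particular that

$$f_i - \frac{1}{N}\mathrm{tr}\,\mathbf{R}_i\left(\sum_{j=1}^{K}\bar f_j\mathbf{R}_j - z\mathbf{I}_N\right)^{-1} \xrightarrow{\mathrm{a.s.}} 0.$$

Contrary to classical deterministic equivalent approaches for random matrices with i.i.d. entries, finding the
approximation $\frac{1}{N}\mathrm{tr}\,\mathbf{A}(\mathbf{F} - z\mathbf{I}_N)^{-1}$ for $\frac{1}{N}\mathrm{tr}\,\mathbf{A}(\mathbf{B}_N - z\mathbf{I}_N)^{-1}$ is not straightforward. The reason is that, during
the derivation, terms such as $\frac{1}{N_i - n_i}\mathrm{tr}\,(\mathbf{I}_{N_i} - \mathbf{W}_i\mathbf{W}_i^H)\mathbf{H}_i^H(\mathbf{B}_N - z\mathbf{I}_N)^{-1}\mathbf{H}_i$ with the $(\mathbf{I}_{N_i} - \mathbf{W}_i\mathbf{W}_i^H)$ prefix will
naturally appear which need to be controlled. We proceed as follows.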

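• We first denote, for all $i$, $\delta_i = \frac{1}{N_i - n_i}\mathrm{tr}\,(\mathbf{I}_{N_i} - \mathbf{W}_i\mathbf{W}_i^H)\mathbf{H}_i^H(\mathbf{B}_N - z\mathbf{I}_N)^{-1}\mathbf{H}_i$ some auxiliary variable. Then
we prove

$$f_i - \frac{1}{N}\mathrm{tr}\,\mathbf{R}_i(\mathbf{G} - z\mathbf{I}_N)^{-1} \xrightarrow{\mathrm{a.s.}} 0,$$

with $\mathbf{G} = \sum_{j=1}^{K}\bar g_j\mathbf{R}_j$ and

$$\bar g_i = \frac{1}{(1 - c_i)\bar c_i + \frac{1}{N}\sum_{l=1}^{n_i}\frac{1}{1 + p_{il}\delta_i}}\cdot\frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}}{1 + p_{il}\delta_i}$$

where $p_{il}$ denotes the $l$th eigenvalue of $\mathbf{P}_i$, and $\delta_i$ is linked to $f_i$ through

$$f_i - \delta_i\left((1 - c_i)\bar c_i + \frac{1}{N}\sum_{l=1}^{n_i}\frac{1}{1 + p_{il}\delta_i}\right) \xrightarrow{\mathrm{a.s.}} 0.$$

• This expression of $\bar g_i$, which is not convenient under this form, is then shown to satisfy

$$\bar g_i - \frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}}{\bar c_i - f_i\bar g_i + p_{il}f_i} = \bar g_i - \frac{1}{N}\mathrm{tr}\,\mathbf{P}_i\left(f_i\mathbf{P}_i + [\bar c_i - f_i\bar g_i]\mathbf{I}_{n_i}\right)^{-1} \xrightarrow{\mathrm{a.s.}} 0$$

which induces the $2K$-equation system

$$f_i - \frac{1}{N}\mathrm{tr}\,\mathbf{R}_i\left(\sum_{j=1}^{K}\bar g_j\mathbf{R}_j - z\mathbf{I}_N\right)^{-1} \xrightarrow{\mathrm{a.s.}} 0$$
$$\bar g_i - \frac{1}{N}\mathrm{tr}\,\mathbf{P}_i\left(f_i\mathbf{P}_i + [\bar c_i - f_i\bar g_i]\mathbf{I}_{n_i}\right)^{-1} \xrightarrow{\mathrm{a.s.}} 0.$$

These relations are sufficient to infer the deterministic equivalent, but will be made more attractive for further
considerations by introducing $\mathbf{F} = \sum_{i=1}^{K}\bar f_i\mathbf{R}_i$ and proving that

$$f_i - \frac{1}{N}\mathrm{tr}\,\mathbf{R}_i\left(\sum_{j=1}^{K}\bar f_j\mathbf{R}_j - z\mathbf{I}_N\right)^{-1} \xrightarrow{\mathrm{a.s.}} 0$$
$$\bar f_i - \frac{1}{N}\mathrm{tr}\,\mathbf{P}_i\left(f_i\mathbf{P}_i + [\bar c_i - f_i\bar f_i]\mathbf{I}_{n_i}\right)^{-1} = 0,$$

where, for $z < 0$, $\bar f_i$ lies in $[0, c_i\bar c_i/f_i)$ and is now uniquely determined by $f_i$. In order to establish this
convergence, it is necessary to define an analytic extension of $\bar f_i$ in a neighborhood of $\mathbb{R}_-$. The function $f_i$
can be immediately extended to $\mathbb{C}\setminus\mathbb{R}_+$ where it verifies the properties of a Stieltjes transform of a finite
measure supported by $\mathbb{R}_+$.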

This is the very technical part of the proof. We then prove in a second step the existence and uniqueness of a
solution to the fixed-point equation

$$e_i - \frac{1}{N}\mathrm{tr}\,\mathbf{R}_i\left(\sum_{j=1}^{K}\bar e_j\mathbf{R}_j - z\mathbf{I}_N\right)^{-1} = 0$$
$$\bar e_i - \frac{1}{N}\mathrm{tr}\,\mathbf{P}_i\left(e_i\mathbf{P}_i + [\bar c_i - e_i\bar e_i]\mathbf{I}_{n_i}\right)^{-1} = 0,$$

for all finite $N$, $z < 0$ and for $\bar e_i \in [0, c_i\bar c_i/e_i)$. This follows from a property of so-called standard functions. We
will show precisely that the vector application $h = (h_1, \ldots, h_K)$ defined for $z < 0$ by

$$h_i : (x_1, \ldots, x_K) \mapsto \frac{1}{N}\mathrm{tr}\,\mathbf{R}_i\left(\sum_{j=1}^{K}\bar x_j\mathbf{R}_j - z\mathbf{I}_N\right)^{-1}$$

with $\bar x_i$ the unique solution to

$$\bar x_i = \frac{1}{N}\mathrm{tr}\,\mathbf{P}_i\left(x_i\mathbf{P}_i + [\bar c_i - x_i\bar x_i]\mathbf{I}_{n_i}\right)^{-1}$$

lying in $[0, c_i\bar c_i/x_i)$, is a standard function. It will follow, from [38, Theorem 2], that the fixed-point equation in
$(e_1, \ldots, e_K)$ has a unique solution with positive entries and that this solution can be determined as the limiting
iteration of a classical fixed-point algorithm. We will further establish that the $e_i(z)$ are Stieltjes transforms of
finite measures supported by $\mathbb{R}_+$ which satisfy the fundamental equations for $z \in D$.
The last step proves that the unique solution $(e_1, \ldots, e_K)$ is such that

$$e_i - f_i \xrightarrow{\mathrm{a.s.}} 0,$$

which is solved by standard arguments. This will entail immediately by classical complex analysis arguments that
$m_N(z) - \bar m_N(z) \xrightarrow{\mathrm{a.s.}} 0$ for all $z \in \mathbb{C}\setminus\mathbb{R}_+$, from which the almost sure convergence $F_N - \bar F_N \Rightarrow 0$ follows.

B. Complete proof 

We recall that, as $N$ grows, the ratios $c_i = n_i/N_i$ for $i \in \{1, \ldots, K\}$ satisfy

$$\limsup_N c_i < 1.$$

We also assume for the time being that, for all $i$, $\|\mathbf{R}_i\|$ is uniformly bounded. The case where $\|\mathbf{R}_i\|$ is uniformly
bounded only in the almost sure sense will be treated subsequently.

Step 1: Convergence

In this section, we take $z < 0$, until further notice. Let us first introduce the following parameters. We will
denote $P = \max_i\{\limsup\|\mathbf{P}_i\|\}$, $R = \max_i\{\limsup\|\mathbf{R}_i\|\}$, $c_+ = \max_i\{\limsup c_i\}$, $\bar c_- = \min_i\{\liminf \bar c_i\}$ and
$\bar c_+ = \max_i\{\limsup \bar c_i\}$.

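Let $\mathbf{A} \in \mathbb{C}^{N\times N}$ be a Hermitian nonnegative definite matrix, satisfying $\|\mathbf{A}\| \le A < \infty$. Recall the definition
$\mathbf{R}_i = \mathbf{H}_i\mathbf{H}_i^H$. Taking $\mathbf{G} = \sum_{i=1}^{K}\bar g_i\mathbf{R}_i$, with $\bar g_1, \ldots, \bar g_K$ scalars left undefined for the moment, we have

$$\frac{1}{N}\mathrm{tr}\,\mathbf{A}(\mathbf{B}_N - z\mathbf{I}_N)^{-1} - \frac{1}{N}\mathrm{tr}\,\mathbf{A}(\mathbf{G} - z\mathbf{I}_N)^{-1}$$
$$\overset{(a)}{=} \frac{1}{N}\mathrm{tr}\left[\mathbf{A}(\mathbf{B}_N - z\mathbf{I}_N)^{-1}\sum_{i=1}^{K}\mathbf{H}_i\left(-\mathbf{W}_i\mathbf{P}_i\mathbf{W}_i^H + \bar g_i\mathbf{I}_{N_i}\right)\mathbf{H}_i^H(\mathbf{G} - z\mathbf{I}_N)^{-1}\right]$$
$$\overset{(b)}{=} \sum_{i=1}^{K}\bar g_i\frac{1}{N}\mathrm{tr}\,\mathbf{A}(\mathbf{B}_N - z\mathbf{I}_N)^{-1}\mathbf{R}_i(\mathbf{G} - z\mathbf{I}_N)^{-1} - \frac{1}{N}\sum_{i=1}^{K}\sum_{l=1}^{n_i}p_{il}\mathbf{w}_{il}^H\mathbf{H}_i^H(\mathbf{G} - z\mathbf{I}_N)^{-1}\mathbf{A}(\mathbf{B}_N - z\mathbf{I}_N)^{-1}\mathbf{H}_i\mathbf{w}_{il}$$
$$\overset{(c)}{=} \sum_{i=1}^{K}\bar g_i\frac{1}{N}\mathrm{tr}\,\mathbf{A}(\mathbf{B}_N - z\mathbf{I}_N)^{-1}\mathbf{R}_i(\mathbf{G} - z\mathbf{I}_N)^{-1} - \frac{1}{N}\sum_{i=1}^{K}\sum_{l=1}^{n_i}\frac{p_{il}\mathbf{w}_{il}^H\mathbf{H}_i^H(\mathbf{G} - z\mathbf{I}_N)^{-1}\mathbf{A}(\mathbf{B}_{(i,l)} - z\mathbf{I}_N)^{-1}\mathbf{H}_i\mathbf{w}_{il}}{1 + p_{il}\mathbf{w}_{il}^H\mathbf{H}_i^H(\mathbf{B}_{(i,l)} - z\mathbf{I}_N)^{-1}\mathbf{H}_i\mathbf{w}_{il}} \qquad (14)$$

with $\mathbf{w}_{il} \in \mathbb{C}^{N_i}$ the $l$th column of $\mathbf{W}_i$, $p_{i1}, \ldots, p_{in_i}$ the eigenvalues of $\mathbf{P}_i$ and $\mathbf{B}_{(i,l)} = \mathbf{B}_N - p_{il}\mathbf{H}_i\mathbf{w}_{il}\mathbf{w}_{il}^H\mathbf{H}_i^H$.
The equality (a) follows from Lemma 2, (b) follows from the decomposition $\mathbf{W}_i\mathbf{P}_i\mathbf{W}_i^H = \sum_{l=1}^{n_i}p_{il}\mathbf{w}_{il}\mathbf{w}_{il}^H$, while
the equality (c) follows from Lemma 3.

The idea now is to infer the values of the $\bar g_i$ such that the differences in (14) go to zero almost surely as $N$
grows large. We will therefore proceed by studying the quantities $\mathbf{w}_{il}^H\mathbf{H}_i^H(\mathbf{B}_{(i,l)} - z\mathbf{I}_N)^{-1}\mathbf{H}_i\mathbf{w}_{il}$ and
$\mathbf{w}_{il}^H\mathbf{H}_i^H(\mathbf{G} - z\mathbf{I}_N)^{-1}\mathbf{A}(\mathbf{B}_{(i,l)} - z\mathbf{I}_N)^{-1}\mathbf{H}_i\mathbf{w}_{il}$ in the denominator and numerator of the second term in (14).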
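The chain of equalities leading to (14) can be checked numerically on a small example. The sketch below assumes that step (a) is the resolvent identity $\mathbf{X}^{-1} - \mathbf{Y}^{-1} = \mathbf{X}^{-1}(\mathbf{Y} - \mathbf{X})\mathbf{Y}^{-1}$ and that step (c) is the rank-one (Sherman-Morrison) identity; identifying these with Lemma 2 and Lemma 3 is an assumption, and all dimensions, values and variable names are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)

def cn(r, c):
    """Complex Gaussian matrix with unit-variance entries."""
    return (rng.standard_normal((r, c)) + 1j * rng.standard_normal((r, c))) / np.sqrt(2)

def haar_isometry(N_k, n_k):
    Q, R = np.linalg.qr(cn(N_k, N_k))
    d = np.diagonal(R)
    return (Q * (d / np.abs(d)))[:, :n_k]

N, K, N_i, n_i, z = 6, 2, 5, 3, -0.8
I = np.eye(N)
H = [cn(N, N_i) for _ in range(K)]
W = [haar_isometry(N_i, n_i) for _ in range(K)]
p = [rng.uniform(0.5, 2.0, n_i) for _ in range(K)]   # eigenvalues of P_i
g = rng.uniform(0.1, 1.0, K)                          # arbitrary scalars playing the role of g-bar_i
X = cn(N, N)
A = X @ X.conj().T                                    # Hermitian nonnegative test matrix

R = [Hi @ Hi.conj().T for Hi in H]
B = sum(H[i] @ W[i] @ np.diag(p[i]) @ W[i].conj().T @ H[i].conj().T for i in range(K))
G = sum(g[i] * R[i] for i in range(K))
QB, QG = np.linalg.inv(B - z * I), np.linalg.inv(G - z * I)

# Left-hand side of (14)
lhs = np.trace(A @ QB - A @ QG) / N

# Right-hand side of (14), last line
rhs = sum(g[i] * np.trace(A @ QB @ R[i] @ QG) for i in range(K)) / N
for i in range(K):
    for l in range(n_i):
        u = H[i] @ W[i][:, l]                                         # H_i w_il
        Q_il = np.linalg.inv(B - p[i][l] * np.outer(u, u.conj()) - z * I)
        num = u.conj() @ QG @ A @ Q_il @ u
        den = 1 + p[i][l] * (u.conj() @ Q_il @ u)
        rhs -= p[i][l] * num / den / N

print("difference between both sides:", abs(lhs - rhs))
```

The identity is purely algebraic, so the two sides agree up to rounding errors for any choice of the scalars `g`; the probabilistic part of the proof only enters when the quadratic forms are replaced by their limits.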
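For every $i \in \{1, \ldots, K\}$, denote

$$\delta_i = \frac{1}{N_i - n_i}\mathrm{tr}\,(\mathbf{I}_{N_i} - \mathbf{W}_i\mathbf{W}_i^H)\mathbf{H}_i^H(\mathbf{B}_N - z\mathbf{I}_N)^{-1}\mathbf{H}_i. \qquad (15)$$

Introducing the additional term $(\mathbf{G} - z\mathbf{I}_N)^{-1}\mathbf{A}$ in the argument of the trace in $\delta_i$, we denote

$$\beta_i = \frac{1}{N_i - n_i}\mathrm{tr}\,(\mathbf{I}_{N_i} - \mathbf{W}_i\mathbf{W}_i^H)\mathbf{H}_i^H(\mathbf{G} - z\mathbf{I}_N)^{-1}\mathbf{A}(\mathbf{B}_N - z\mathbf{I}_N)^{-1}\mathbf{H}_i.$$

Under these notations, according to Lemma 5, the quantity $\mathbf{w}_{il}^H\mathbf{H}_i^H(\mathbf{B}_{(i,l)} - z\mathbf{I}_N)^{-1}\mathbf{H}_i\mathbf{w}_{il}$ is asymptotically close
to $\delta_i$ and, if $\mathbf{G}$ is independent of $\mathbf{w}_{il}$, the quantity $\mathbf{w}_{il}^H\mathbf{H}_i^H(\mathbf{G} - z\mathbf{I}_N)^{-1}\mathbf{A}(\mathbf{B}_{(i,l)} - z\mathbf{I}_N)^{-1}\mathbf{H}_i\mathbf{w}_{il}$ is asymptotically
close to $\beta_i$.

We also define

$$f_i = \frac{1}{N}\mathrm{tr}\,\mathbf{R}_i(\mathbf{B}_N - z\mathbf{I}_N)^{-1} \qquad (16)$$

for $z \in \mathbb{C}\setminus\mathbb{R}_+$. Note that $f_i(z) > 0$ for $z < 0$. Remark first, from standard matrix inequalities and the fact that
$\mathbf{w}^H\mathbf{A}\mathbf{w} \le \|\mathbf{A}\|$ for any Hermitian matrix $\mathbf{A}$ and any unit-norm vector $\mathbf{w}$, that we have the following bounds on $\delta_i$,
$\beta_i$ and $f_i$:

$$\delta_i \le \frac{R}{|z|}, \qquad \beta_i \le \frac{RA}{|z|^2}, \qquad f_i \le \frac{R}{|z|}.$$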

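From Lemma 3, we have that

$$(1 - c_i)\bar c_i\delta_i = f_i - \frac{1}{N}\sum_{l=1}^{n_i}\mathbf{w}_{il}^H\mathbf{H}_i^H(\mathbf{B}_N - z\mathbf{I}_N)^{-1}\mathbf{H}_i\mathbf{w}_{il} = f_i - \frac{1}{N}\sum_{l=1}^{n_i}\frac{\mathbf{w}_{il}^H\mathbf{H}_i^H(\mathbf{B}_{(i,l)} - z\mathbf{I}_N)^{-1}\mathbf{H}_i\mathbf{w}_{il}}{1 + p_{il}\mathbf{w}_{il}^H\mathbf{H}_i^H(\mathbf{B}_{(i,l)} - z\mathbf{I}_N)^{-1}\mathbf{H}_i\mathbf{w}_{il}}. \qquad (17)$$

Since $z < 0$, $\delta_i \ge 0$, and $\frac{1}{1 + p_{il}\delta_i}$ is well defined. By adding the term $\frac{1}{N}\sum_{l=1}^{n_i}\frac{\delta_i}{1 + p_{il}\delta_i}$ on both sides, (17) can be
re-written as

$$(1 - c_i)\bar c_i\delta_i - f_i + \frac{1}{N}\sum_{l=1}^{n_i}\frac{\delta_i}{1 + p_{il}\delta_i} = \frac{1}{N}\sum_{l=1}^{n_i}\left[\frac{\delta_i}{1 + p_{il}\delta_i} - \frac{\mathbf{w}_{il}^H\mathbf{H}_i^H(\mathbf{B}_{(i,l)} - z\mathbf{I}_N)^{-1}\mathbf{H}_i\mathbf{w}_{il}}{1 + p_{il}\mathbf{w}_{il}^H\mathbf{H}_i^H(\mathbf{B}_{(i,l)} - z\mathbf{I}_N)^{-1}\mathbf{H}_i\mathbf{w}_{il}}\right]$$
$$= \frac{1}{N}\sum_{l=1}^{n_i}\frac{\delta_i - \mathbf{w}_{il}^H\mathbf{H}_i^H(\mathbf{B}_{(i,l)} - z\mathbf{I}_N)^{-1}\mathbf{H}_i\mathbf{w}_{il}}{(1 + p_{il}\delta_i)\left(1 + p_{il}\mathbf{w}_{il}^H\mathbf{H}_i^H(\mathbf{B}_{(i,l)} - z\mathbf{I}_N)^{-1}\mathbf{H}_i\mathbf{w}_{il}\right)}.$$

We now apply Lemma 5 and Lemma 7, which together with $\delta_i \le R|z|^{-1}$ ensure that

$$\mathrm{E}\left[\left|(1 - c_i)\bar c_i\delta_i - f_i + \frac{1}{N}\sum_{l=1}^{n_i}\frac{\delta_i}{1 + p_{il}\delta_i}\right|^4\right] \le \frac{C}{N^2} \qquad (18)$$

for some constant $C > 0$. This determines the asymptotic behavior of $\delta_i$ and, thus, the asymptotic behavior of the
quantity $\mathbf{w}_{il}^H\mathbf{H}_i^H(\mathbf{B}_{(i,l)} - z\mathbf{I}_N)^{-1}\mathbf{H}_i\mathbf{w}_{il}$ in the denominator of (14).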
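The relation controlled by (18) can be observed numerically. The sketch below computes, for a single user ($K = 1$), the gap $f_1 - \delta_1\big((1 - c_1)\bar c_1 + \frac{1}{N}\sum_l\frac{1}{1 + p_{1l}\delta_1}\big)$ for growing dimensions; the function name `delta_relation_gap`, the Gaussian channel with $\mathcal{CN}(0, 1/N_1)$ entries and the chosen dimensions are assumptions made for illustration only.

```python
import numpy as np

def delta_relation_gap(H, W, p, z):
    """Return f - delta * ((1 - c) * cbar + (1/N) * sum_l 1 / (1 + p_l * delta))
    for K = 1, with f and delta defined as in (16) and (15)."""
    N, N_1 = H.shape
    n_1 = W.shape[1]
    c, cbar = n_1 / N_1, N_1 / N
    B = H @ W @ np.diag(p) @ W.conj().T @ H.conj().T
    Q = np.linalg.inv(B - z * np.eye(N))
    M = H.conj().T @ Q @ H
    f = np.trace(M).real / N                                   # (1/N) tr R (B - zI)^{-1}
    delta = np.trace((np.eye(N_1) - W @ W.conj().T) @ M).real / (N_1 - n_1)
    return f - delta * ((1 - c) * cbar + np.sum(1 / (1 + p * delta)) / N)

def haar_isometry(N_k, n_k, rng):
    G = rng.standard_normal((N_k, N_k)) + 1j * rng.standard_normal((N_k, N_k))
    Q, R = np.linalg.qr(G)
    d = np.diagonal(R)
    return (Q * (d / np.abs(d)))[:, :n_k]

rng = np.random.default_rng(2)
for N in (40, 160):
    N_1, n_1 = N // 2, N // 4
    H = (rng.standard_normal((N, N_1)) + 1j * rng.standard_normal((N, N_1))) / np.sqrt(2 * N_1)
    W = haar_isometry(N_1, n_1, rng)
    p = rng.uniform(0.5, 1.5, n_1)
    print(N, delta_relation_gap(H, W, p, z=-1.0))
```

In the degenerate case $\mathbf{H}_1 = \mathbf{I}_N$ and $\mathbf{P}_1 = \mathbf{I}_{n_1}$ the gap vanishes exactly, since then $\delta_1 = -1/z$ and $f_1 = \frac{c_1}{1 - z} - \frac{1 - c_1}{z}$; this provides a quick check of the implementation.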
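We now proceed similarly with $\beta_i$ as with $\delta_i$. Assuming first that $\mathbf{G}$ is independent of $\mathbf{w}_{il}$, we obtain

$$\beta_i = \frac{1}{N_i - n_i}\mathrm{tr}\,\mathbf{H}_i^H(\mathbf{G} - z\mathbf{I}_N)^{-1}\mathbf{A}(\mathbf{B}_N - z\mathbf{I}_N)^{-1}\mathbf{H}_i - \frac{1}{N_i - n_i}\sum_{l=1}^{n_i}\mathbf{w}_{il}^H\mathbf{H}_i^H(\mathbf{G} - z\mathbf{I}_N)^{-1}\mathbf{A}(\mathbf{B}_N - z\mathbf{I}_N)^{-1}\mathbf{H}_i\mathbf{w}_{il}$$
$$= \frac{1}{N_i - n_i}\mathrm{tr}\,\mathbf{H}_i^H(\mathbf{G} - z\mathbf{I}_N)^{-1}\mathbf{A}(\mathbf{B}_N - z\mathbf{I}_N)^{-1}\mathbf{H}_i - \frac{1}{N_i - n_i}\sum_{l=1}^{n_i}\frac{\mathbf{w}_{il}^H\mathbf{H}_i^H(\mathbf{G} - z\mathbf{I}_N)^{-1}\mathbf{A}(\mathbf{B}_{(i,l)} - z\mathbf{I}_N)^{-1}\mathbf{H}_i\mathbf{w}_{il}}{1 + p_{il}\mathbf{w}_{il}^H\mathbf{H}_i^H(\mathbf{B}_{(i,l)} - z\mathbf{I}_N)^{-1}\mathbf{H}_i\mathbf{w}_{il}}$$

from which we have

$$\frac{1}{N_i - n_i}\mathrm{tr}\,\mathbf{H}_i^H(\mathbf{G} - z\mathbf{I}_N)^{-1}\mathbf{A}(\mathbf{B}_N - z\mathbf{I}_N)^{-1}\mathbf{H}_i - \frac{1}{N_i - n_i}\sum_{l=1}^{n_i}\frac{\beta_i}{1 + p_{il}\delta_i} - \beta_i$$
$$= \frac{1}{N_i - n_i}\sum_{l=1}^{n_i}\left[\frac{\mathbf{w}_{il}^H\mathbf{H}_i^H(\mathbf{G} - z\mathbf{I}_N)^{-1}\mathbf{A}(\mathbf{B}_{(i,l)} - z\mathbf{I}_N)^{-1}\mathbf{H}_i\mathbf{w}_{il}}{1 + p_{il}\mathbf{w}_{il}^H\mathbf{H}_i^H(\mathbf{B}_{(i,l)} - z\mathbf{I}_N)^{-1}\mathbf{H}_i\mathbf{w}_{il}} - \frac{\beta_i}{1 + p_{il}\delta_i}\right].$$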

With the same inequalities as above, and with 

wSH? (G - zl^v)- 1 A (B (M) - zIat)" 1 H jWji < 



M2 



we have that 



E 



= S 



w»H? (G - zl^)" 1 A (B 



(i,i) 



zljv) HjWij 



l+^w^H, H (B (M) -zI w ) 'Hiwa 1+Puti 
wljHf (G - zIn)- 1 A (B (i;0 - zIat)" 1 H lWli - ft 



(1 +pu5i)(l +p, ! w i l jH, H (B (iiJ) - zljv 

Pi A 



+ 



+- 



W HH 4 H (G - zl^)- 1 A (B (M) - zl 



HiW i; 
l 



(1 + Pu5i)(l + puw^H? (B (jiJ) - zljv) 'H.w,, 

r -i 1 4 ' 

p iZ ft <5, - w"H t H (B (M) - zljv) H iWl( 



(1 +Pii<Si)(l +PiiwJ}Hj l (B (i>J) - zljv) H,w 4/ ) 



< 



'iV 2 



P 4 i? 4 / A 4 

1 ~t~ 1 i~T 



for some C" > C. Multiplying (19) by 



Ni 



N 



S we obtain 



E 



< 



1 tr H 4 H (G - zl^)- 1 A (B N - zl^ 1 H, - ft ^(1 - Cj )c, + 1 ^"5^ 



TV 



P 4 i? 4 



^ 4 



(19) 



(20) 



(21) 



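This provides the asymptotic behavior of $\beta_i$ or equivalently of the quantity $\mathbf{w}_{il}^H\mathbf{H}_i^H(\mathbf{G} - z\mathbf{I}_N)^{-1}\mathbf{A}(\mathbf{B}_{(i,l)} - z\mathbf{I}_N)^{-1}\mathbf{H}_i\mathbf{w}_{il}$
in the numerator of (14).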

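We are now in position to infer the $\bar g_i$ such that $\frac{1}{N}\mathrm{tr}\,\mathbf{A}(\mathbf{B}_N - z\mathbf{I}_N)^{-1} - \frac{1}{N}\mathrm{tr}\,\mathbf{A}(\mathbf{G} - z\mathbf{I}_N)^{-1}$ is asymptotically
small. For the previous derivations to hold, the scalars $\bar g_k$, $k \in \{1, \ldots, K\}$, were assumed independent of $\mathbf{w}_{il}$.
It is however easy to see that these derivations still hold true (up to the choice of larger constants $C$, $C'$) if
$\bar g_k = \bar g_k^{(i,l)} + \varepsilon_{k,N}^{(i,l)}$ with $\bar g_k^{(i,l)}$ independent of $\mathbf{w}_{il}$ and $|\varepsilon_{k,N}^{(i,l)}| \le C''/N$, for $C''$ constant independent of $k, i, l$. This
follows from the fact that

$$\left\|\sum_{k=1}^{K}\bar g_k\mathbf{R}_k - \sum_{k=1}^{K}\bar g_k^{(i,l)}\mathbf{R}_k\right\| = \left\|\sum_{k=1}^{K}\varepsilon_{k,N}^{(i,l)}\mathbf{R}_k\right\| \le \frac{KRC''}{N}.$$

We choose

$$\bar g_k = \frac{\frac{1}{N}\sum_{l=1}^{n_k}\frac{p_{kl}}{1 + p_{kl}\delta_k}}{(1 - c_k)\bar c_k + \frac{1}{N}\sum_{l=1}^{n_k}\frac{1}{1 + p_{kl}\delta_k}} \qquad (22)$$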

and remark that $\bar g_k - \bar g_k^{(i,l)} = O(1/N)$, with $\bar g_k^{(i,l)}$ defined similarly to $\bar g_k$ in (22) but with column $\mathbf{w}_{il}$ removed from the
expression of $\mathbf{B}_N$. Indeed, by Lemma 7, the term $\delta_k^{(i,l)}$, defined equivalently as $\bar g_k^{(i,l)}$ with $\mathbf{w}_{il}$ removed, differs from $\delta_k$
by a quantity of order $O(1/N)$, from which the result follows.

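Coming back to the original object of interest, we now have

$$\frac{1}{N}\mathrm{tr}\,\mathbf{A}(\mathbf{B}_N - z\mathbf{I}_N)^{-1} - \frac{1}{N}\mathrm{tr}\,\mathbf{A}(\mathbf{G} - z\mathbf{I}_N)^{-1}$$
$$= \sum_{i=1}^{K}\frac{1}{N}\sum_{l=1}^{n_i}p_{il}\left[\frac{\frac{1}{N}\mathrm{tr}\,\mathbf{H}_i^H(\mathbf{G} - z\mathbf{I}_N)^{-1}\mathbf{A}(\mathbf{B}_N - z\mathbf{I}_N)^{-1}\mathbf{H}_i}{\left((1 - c_i)\bar c_i + \frac{1}{N}\sum_{l'=1}^{n_i}\frac{1}{1 + p_{il'}\delta_i}\right)(1 + p_{il}\delta_i)} - \frac{\mathbf{w}_{il}^H\mathbf{H}_i^H(\mathbf{G} - z\mathbf{I}_N)^{-1}\mathbf{A}(\mathbf{B}_{(i,l)} - z\mathbf{I}_N)^{-1}\mathbf{H}_i\mathbf{w}_{il}}{1 + p_{il}\mathbf{w}_{il}^H\mathbf{H}_i^H(\mathbf{B}_{(i,l)} - z\mathbf{I}_N)^{-1}\mathbf{H}_i\mathbf{w}_{il}}\right].$$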

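Notice now that $1 + p_{il}\delta_i \ge 1$ and

$$(1 - c_i)\bar c_i \le (1 - c_i)\bar c_i + \frac{1}{N}\sum_{l=1}^{n_i}\frac{1}{1 + p_{il}\delta_i} \le \bar c_i$$

which ensure that we can divide the term in the expectation of the left-hand side of (21) by $1 + p_{il}\delta_i$ and
$(1 - c_i)\bar c_i + \frac{1}{N}\sum_{l=1}^{n_i}\frac{1}{1 + p_{il}\delta_i}$ without taking the risk of the denominator getting close to 0. This leads to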



E 



V tr H, H (G - zIn)- 1 A (B n - zl^)" 1 H, 



N 



l+Pil5i ((1 - Ci)-Ci + £ E£i t4ia) (1 +P»8i) 



< 



a 



P 4 R 4 



i + 



A* 



(23) 



From (20) and (23), we therefore have 



E 



trH^G-zlArj-'ACBjv-zlAr)- 1 ^ wHH J H (G-zI 7V )- 1 A(B (M) - zI n ) 1 H lWji 



((l - Cl )c t + i Er=i j+ks-) (i + Pii*i) 1 + ^ w " H " (B(i,o - ^) 1 H < w ^ 



< 128- 



'7V 2 (l-c,) 4 cf 
We finally obtain 
1 



1 + 



P 4 R 4 



1 + 



E 



— tr A(B W - zIn)- 1 - — tr A(G - zljv) 



< 128X 4 



C" 



iV 2 (l - c+) 4 c 4 _ 



P 4 i? 4 



1 + 



A 4 



(24) 
This provides a first convergence result as a function of the parameters $\delta_i$, from which a deterministic equivalent
can be inferred. Nonetheless, the expression of $\bar g_i$ is rather impractical as it stands and we need to go further.
Observe in particular that $\bar g_i$ can be written under the form

$$\bar g_i = \frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}}{\left((1 - c_i)\bar c_i + \frac{1}{N}\sum_{l'=1}^{n_i}\frac{1}{1 + p_{il'}\delta_i}\right) + p_{il}\delta_i\left((1 - c_i)\bar c_i + \frac{1}{N}\sum_{l'=1}^{n_i}\frac{1}{1 + p_{il'}\delta_i}\right)}.$$

We will study the denominator of the above expression and show that it can be simplified to a much more attractive
form.

From (18), we first have 

r / „ v 4i 

8C 



ft -5d(i- a)ci + 



N 2 ' 



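Multiplying (22) by $-\delta_i\left((1 - c_i)\bar c_i + \frac{1}{N}\sum_{l=1}^{n_i}\frac{1}{1 + p_{il}\delta_i}\right)$ and adding $\bar c_i$ to both sides yields

$$\bar c_i - \bar g_i\delta_i\left((1 - c_i)\bar c_i + \frac{1}{N}\sum_{l=1}^{n_i}\frac{1}{1 + p_{il}\delta_i}\right) = (1 - c_i)\bar c_i + \frac{1}{N}\sum_{l=1}^{n_i}\frac{1}{1 + p_{il}\delta_i},$$

since $\frac{p_{il}\delta_i}{1 + p_{il}\delta_i} = 1 - \frac{1}{1 + p_{il}\delta_i}$ and $n_i/N = c_i\bar c_i$. By definition, $\bar g_i \le \frac{P}{(1 - c_i)\bar c_i}$, and we therefore also have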



(25) 



(ci - /iff,) - (1 - Ci)ci + 



, C P 4 

'iV 5 (1 - C + ) 4 ci 



(26) 



The equations (25) and (26) can now be used to approximate the denominator of gi as follows 



E 



1 n 4 



J^°i~ fi9i +Pilfi 



E 



N 



Pa 


fi Si((l Ci) Ci + ^£j4i 


+ 


C l /ift ((1 C l) C ! + iV El' = l l+p;,,*;) 




[(1 +PiI*i)((l Ci) Ci + ^ X)//=l 1+^,5.) 


[ci - fi9i+Pufi] 



(27) 



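Before providing a useful bound, we need to ensure here that the term $\bar c_i - f_i\bar g_i + p_{il}f_i$ is uniformly away from
zero, for all random $f_i$, and for all $N$. For this, we recall the bounds $0 \le f_i \le \frac{R}{|z|}$ and $0 \le \bar g_i \le \frac{P}{(1 - c_i)\bar c_i}$.
Let us consider $0 < \varepsilon < 1$ and take from now on

$$z < -\frac{RP}{(1 - c_+)\bar c_-(\bar c_- - \varepsilon)},$$

so that $\bar c_i - f_i\bar g_i > \varepsilon$ for all $i$. From (25), (26) and (27), we have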



E 



9i ~ 



1 «, 

AT E 



< 64 



C 



p8 



N fr[ c i- fcgi+Ptif; 

which is of order 0(1/7V 2 ). 

We are now ready to introduce the matrix $\mathbf{F}$. Consider

$$\mathbf{F} = \sum_{i=1}^{K}\bar f_i\mathbf{R}_i$$

with $\bar f_i$ defined as the unique solution to the equation in $x$

$$x = \frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}}{\bar c_i - f_ix + f_ip_{il}}$$

within the interval $0 \le x < c_i\bar c_i/f_i$. To prove the uniqueness of the solution within this interval, note simply that

$$\frac{c_i\bar c_i}{f_i} > \frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}}{\bar c_i - f_i(c_i\bar c_i/f_i) + f_ip_{il}}, \qquad 0 \le \frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}}{\bar c_i - f_i\cdot 0 + f_ip_{il}}$$

and that the function $x \mapsto \frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}}{\bar c_i - f_ix + f_ip_{il}}$ is continuous and increasing on $x \in [0, c_i\bar c_i/f_i)$. Hence the uniqueness
of the solution in $[0, c_i\bar c_i/f_i)$. We also show that this solution is an attractor of the fixed-point algorithm, when
correctly initialized. Indeed, let $x_0, x_1, \ldots$ be defined by

$$x_{n+1} = \frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}}{\bar c_i - f_ix_n + f_ip_{il}}$$

with $x_0 \in [0, c_i\bar c_i/f_i)$. Then, $x_n \in [0, c_i\bar c_i/f_i)$ implies $\bar c_i - f_ix_n + f_ip_{il} > (1 - c_i)\bar c_i + f_ip_{il} > f_ip_{il}$ and therefore
$f_ix_{n+1} < c_i\bar c_i$, so $x_0, x_1, \ldots$ is contained in $[0, c_i\bar c_i/f_i)$. Now observe that

$$x_{n+1} - x_n = \frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}f_i(x_n - x_{n-1})}{(\bar c_i + p_{il}f_i - f_ix_n)(\bar c_i + p_{il}f_i - f_ix_{n-1})}$$

with all terms being nonnegative in the sum, so that the differences $x_{n+1} - x_n$ and $x_n - x_{n-1}$ have the same sign.
The sequence $x_0, x_1, \ldots$ is therefore monotonic and bounded: it converges. Calling $x_\infty$ this limit, we have

$$x_\infty = \frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}}{\bar c_i + p_{il}f_i - f_ix_\infty}$$

as required.
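The monotone convergence argument above can be illustrated with a few lines of code. The sketch below iterates $x_{n+1} = \frac{1}{N}\sum_l\frac{p_{il}}{\bar c_i - f_ix_n + f_ip_{il}}$ from $x_0 = 0$; the function name `fixed_point_sequence` and the numerical values of $f_i$, $p_{il}$ and the dimensions are hypothetical and chosen only for illustration.

```python
import numpy as np

def fixed_point_sequence(p, f, cbar, N, n_iter=200):
    """Iterates x_{n+1} = (1/N) * sum_l p_l / (cbar - f * x_n + f * p_l), starting at x_0 = 0."""
    xs = [0.0]
    for _ in range(n_iter):
        xs.append(np.sum(p / (cbar - f * xs[-1] + f * p)) / N)
    return np.array(xs)

rng = np.random.default_rng(0)
N, N_i, n_i, f = 20, 20, 10, 0.7        # hypothetical values
p = rng.uniform(0.2, 2.0, n_i)          # eigenvalues of P_i
c, cbar = n_i / N_i, N_i / N

xs = fixed_point_sequence(p, f, cbar, N)
print("first iterates:", xs[:5])
print("limit:", xs[-1], "upper end of the interval:", c * cbar / f)
```

Starting from $x_0 = 0$, the first step satisfies $x_1 \ge x_0$, so by the sign argument the whole sequence is nondecreasing and stays below $c_i\bar c_i/f_i$.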

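To be able to finally prove that $\frac{1}{N}\mathrm{tr}\,\mathbf{A}(\mathbf{B}_N - z\mathbf{I}_N)^{-1} - \frac{1}{N}\mathrm{tr}\,\mathbf{A}(\mathbf{F} - z\mathbf{I}_N)^{-1} \xrightarrow{\mathrm{a.s.}} 0$, we want now to show that
$\bar g_i - \bar f_i$ tends to zero at a sufficiently fast rate. For this, we write

$$\mathrm{E}\left[|\bar g_i - \bar f_i|^4\right] \le 8\left(\mathrm{E}\left[\left|\bar g_i - \frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}}{\bar c_i - f_i\bar g_i + p_{il}f_i}\right|^4\right] + \mathrm{E}\left[\left|\frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}}{\bar c_i - f_i\bar g_i + p_{il}f_i} - \frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}}{\bar c_i - f_i\bar f_i + p_{il}f_i}\right|^4\right]\right)$$
$$= 8\left(\mathrm{E}\left[\left|\bar g_i - \frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}}{\bar c_i - f_i\bar g_i + p_{il}f_i}\right|^4\right] + \mathrm{E}\left[|\bar g_i - \bar f_i|^4\left|\frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}f_i}{(\bar c_i - f_i\bar f_i + p_{il}f_i)(\bar c_i - f_i\bar g_i + p_{il}f_i)}\right|^4\right]\right) \qquad (28)$$

where we have simply written $\bar g_i - \bar f_i = \left(\bar g_i - \frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}}{\bar c_i - f_i\bar g_i + p_{il}f_i}\right) + \left(\frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}}{\bar c_i - f_i\bar g_i + p_{il}f_i} - \bar f_i\right)$ and used the
triangular inequality on the fourth power of each term.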

We only need to ensure now that the coefficient multiplying $|\bar g_i - \bar f_i|^4$ in the right-hand side term is uniformly
smaller than 1. For this, observe that, as $z \to -\infty$, $|p_{il}f_i| \le \frac{PR}{|z|}$ in the numerator. In the denominator, we
already know that $\bar c_i - f_i\bar f_i + p_{il}f_i > (1 - c_i)\bar c_i$ and we also have that $\bar c_i - f_i\bar g_i + p_{il}f_i > \bar c_i - \frac{RP}{(1 - c_i)\bar c_i|z|}$, which is
greater than some $\eta > 0$ for $|z|$ taken large.

Take < r\ < 1 and choose z to be such that, for all i, 



is 



-T 



Pilfi 



N £^(ci~ fifi+ Pilfi) (ci - fi9i+ Pilfi) 

That is, from now on, we take z < min ( — n — P£ 
From the inequality (28), gathering the terms in E 

512 C 



< 



PR 



< 



l-?7 



8PR flP 

c+)c_' (l-c+)c_( 
J |4~ 
ft - Ji 



z|(l - Ci)CiT} 8 
1^))- 



E 



\9i - fi 



< 



p 



s 



on the left side, we finally have 
1 

1 + 



r/ 4 iV 2 (1 - c 4 ) 4 cf e 4 ^ (l _ Cj )4 g 4 
We can now proceed to prove the deterministic equivalent relations: 

^ti A (G - zInY 1 - ^tv A {¥ - zl N )- 1 



(29) 



K m 

i=i i=i 



k tr H, H A (G - zIn)- 1 (F - zl^)" 1 H, £ tr H, H A (G - zl^)" 1 (F - zl^ 1 H. 



JV 



E^E^ 

i=l 1=1 



((1 - + jfTuLi 1+ p. ll5i )0-+Pu5i) ci - fifi^pufi 

( itrH t H A(G-zI A r)- 1 (F-zI Ar )- 1 H, i tr H" A (G - zl^ 1 (F - zl^ 1 H 4 



+ 



((i - Ci )c t + i E"*=i j+^Ki+PiA) c 4 - /ift 

i tr H" A (G - zlw)- 1 (F - zIat)- 1 H, £ tr H 4 H (G - zl^ 1 (F - zl N y l H, N 



Ci - fi9i + Pufi 



K 



1 1 

^ - trH? A (G - zln)- 1 (F - zl^ 1 H,- ]>> 



Ci - fifi + Pilfi 

fi(g~i - fi) 



+ 



{ci - fifi +Pufi){ci - fi9i+ pufi) 
((Ci ~ fi9i) - ((1 - c^ci + £ E?=i T+^)) +^ (fi - - c *)^ + £ E?=i 1+^) 



((1 - c 4 ) Ci + i E"=i i+p^ X 1 + P l i 5 i)( c i - fifi + Pufi) 
Therefore, from (25), (26) and (29), 



E 



1 tr A (G - zljy)- 1 - 1 tr A (F - zl^ 1 



1 



64i? 4 P 4 ^l 4 X C / 
" |z| 8 (l-c+) 8 c^7V2 ^ + (l- c+ ) 4 ci 



1 



64R 4 P 4 " 



which is of order $O(1/N^2)$.

Together with (24), applying the Markov inequality [39, (5.31)] and the Borel-Cantelli lemma [39, Theorem 4.3], we finally have

$$\frac{1}{N}\operatorname{tr}\mathbf{A}(\mathbf{B}_N - z\mathbf{I}_N)^{-1} - \frac{1}{N}\operatorname{tr}\mathbf{A}(\mathbf{F} - z\mathbf{I}_N)^{-1} \xrightarrow{\text{a.s.}} 0 \tag{30}$$

as $N$ grows large for realizations of $\{\mathbf{W}_1, \ldots, \mathbf{W}_K\}$ taken from a set $A_z \subset \Omega$ of probability one (we use here $\Omega$ to denote the sample space of the probability space generating the sequences of matrices $\{\mathbf{W}_1, \ldots, \mathbf{W}_K\}$ of growing sizes). This therefore holds true for countably many $z$ (smaller than the established bound) with a cluster point in $\mathbb{R}_-$, on a set $A \subset \Omega$ of probability one.

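Before we can extend the convergence to the entire negative real axis, we need to define an analytic extension of $\bar f_i$ in a neighborhood of $\mathbb{R}_-$. Take $D = \left\{z = x + \mathrm{i}y : x < 0,\ |y| < |x|\frac{1-c_i}{c_i}\right\}$. For $z \in D$, the following holds

$$\mathrm{Re}\{f_i\} > 0 \quad\text{and}\quad |\mathrm{Im}\{f_i\}| < \mathrm{Re}\{f_i\}\frac{1-c_i}{c_i}. \tag{31}$$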
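To see this, consider $\mathbf{B}_N = \mathbf{U}\mathbf{D}\mathbf{U}^{\sf H}$ the eigenvalue decomposition of $\mathbf{B}_N$, where $\mathbf{U} = [\mathbf{u}_1 \ldots \mathbf{u}_N] \in \mathbb{C}^{N\times N}$ is unitary and $\mathbf{D} = \operatorname{diag}(d_1, \ldots, d_N)$ contains the nonnegative eigenvalues of $\mathbf{B}_N$. Denoting $z = x + \mathrm{i}y$, we have

$$\begin{aligned}
f_i &= \frac{1}{N}\operatorname{tr}\mathbf{R}_i(\mathbf{B}_N - z\mathbf{I}_N)^{-1} \\
&= \frac{1}{N}\operatorname{tr}\mathbf{R}_i(\mathbf{B}_N - z\mathbf{I}_N)^{-1}(\mathbf{B}_N - z^*\mathbf{I}_N)(\mathbf{B}_N - z^*\mathbf{I}_N)^{-1} \\
&= \frac{1}{N}\sum_{l=1}^{N}\mathbf{u}_l^{\sf H}\mathbf{R}_i\mathbf{u}_l\frac{d_l - x}{(d_l - x)^2 + y^2} + \mathrm{i}\,\frac{1}{N}\sum_{l=1}^{N}\mathbf{u}_l^{\sf H}\mathbf{R}_i\mathbf{u}_l\frac{y}{(d_l - x)^2 + y^2}.
\end{aligned} \tag{32}$$

From the last equation, it follows that $x < 0$ and $|y| < |x|\frac{1-c_i}{c_i}$ imply $\mathrm{Re}\{f_i\} > 0$ and $|\mathrm{Im}\{f_i\}| < \mathrm{Re}\{f_i\}\frac{1-c_i}{c_i}$.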
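The implication stated after (32) can be probed numerically. The sketch below is an illustration under stated assumptions: it works directly with the nonnegative eigenvalues $d_l$ and with nonnegative weights $w_l$ standing for the quadratic forms $\mathbf{u}_l^{\sf H}\mathbf{R}_i\mathbf{u}_l$, all drawn at random, and it takes the region $D$ with slope $(1-c_i)/c_i$. The helper names `f_from_spectrum` and `check_region` are hypothetical.

```python
import random


def f_from_spectrum(z, d, w):
    """f_i = (1/N) * sum_l w_l / (d_l - z), with w_l >= 0 standing for u_l^H R_i u_l."""
    N = len(d)
    return sum(wl / (dl - z) for dl, wl in zip(d, w)) / N


def check_region(c, trials=1000, N=6, seed=0):
    """Draw random spectra and points z = x + iy with x < 0, |y| < |x|(1-c)/c,
    and report whether Re f > 0 and |Im f| < Re f * (1-c)/c held every time."""
    rng = random.Random(seed)
    slope = (1.0 - c) / c
    for _ in range(trials):
        d = [rng.uniform(0.0, 5.0) for _ in range(N)]   # eigenvalues of B_N
        w = [rng.uniform(0.1, 2.0) for _ in range(N)]   # quadratic forms, assumed positive
        x = -rng.uniform(0.01, 10.0)
        y = rng.uniform(-0.99, 0.99) * abs(x) * slope
        f = f_from_spectrum(complex(x, y), d, w)
        if not (f.real > 0.0 and abs(f.imag) < f.real * slope):
            return False
    return True


ok_half = check_region(0.5)
ok_high = check_region(0.9)
ok_low = check_region(0.2)
```

Working with the scalars $d_l$ and $w_l$ avoids any matrix algebra, since (32) shows that $f_i$ depends on $\mathbf{B}_N$ and $\mathbf{R}_i$ only through these quantities.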

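Consider now the sequence $\{q_{i,n}\}_{n\ge0}$ of complex numbers, recursively defined as

$$q_{i,n} = \frac{f_i}{(1-c_i)\bar c_i + \frac{1}{N}\sum_{l=1}^{n_i}\frac{1}{1+p_{il}q_{i,n-1}}}, \quad n \ge 1 \tag{33}$$

and $q_{i,0} = 0$. We will now show that $|q_{i,n}| \le \frac{|f_i|}{(1-c_i)\bar c_i}$ for all $n$ and $z \in D$. First, notice that

$$|q_{i,n}| \le \frac{|f_i|}{(1-c_i)\bar c_i} \tag{34}$$

whenever $\mathrm{Re}\{q_{i,n-1}\} \ge 0$. After some simple algebra, one arrives at

$$\mathrm{Re}\{q_{i,n}\} = \frac{\mathrm{Re}\{f_i\}\left[(1-c_i)\bar c_i + \frac{1}{N}\sum_{l=1}^{n_i}\frac{1}{|1+p_{il}q_{i,n-1}|^2}\right] + \frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}\left(\mathrm{Re}\{f_i\}\mathrm{Re}\{q_{i,n-1}\} - \mathrm{Im}\{f_i\}\mathrm{Im}\{q_{i,n-1}\}\right)}{|1+p_{il}q_{i,n-1}|^2}}{\left|(1-c_i)\bar c_i + \frac{1}{N}\sum_{l=1}^{n_i}\frac{1}{1+p_{il}q_{i,n-1}}\right|^2} \tag{35}$$

$$\mathrm{Im}\{q_{i,n}\} = \frac{\mathrm{Im}\{f_i\}\left[(1-c_i)\bar c_i + \frac{1}{N}\sum_{l=1}^{n_i}\frac{1}{|1+p_{il}q_{i,n-1}|^2}\right] + \frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}\left(\mathrm{Im}\{f_i\}\mathrm{Re}\{q_{i,n-1}\} + \mathrm{Re}\{f_i\}\mathrm{Im}\{q_{i,n-1}\}\right)}{|1+p_{il}q_{i,n-1}|^2}}{\left|(1-c_i)\bar c_i + \frac{1}{N}\sum_{l=1}^{n_i}\frac{1}{1+p_{il}q_{i,n-1}}\right|^2}. \tag{36}$$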



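Now, if we assume $\mathrm{Re}\{q_{i,n-1}\} \ge 0$, we have

$$\mathrm{Re}\{q_{i,n}\} \ge \frac{\mathrm{Re}\{f_i\}(1-c_i)\bar c_i - |\mathrm{Im}\{f_i\}|\frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}|\mathrm{Im}\{q_{i,n-1}\}|}{|1+p_{il}q_{i,n-1}|^2}}{\left|(1-c_i)\bar c_i + \frac{1}{N}\sum_{l=1}^{n_i}\frac{1}{1+p_{il}q_{i,n-1}}\right|^2} \ge \frac{\mathrm{Re}\{f_i\}(1-c_i)\bar c_i - |\mathrm{Im}\{f_i\}|c_i\bar c_i}{\left|(1-c_i)\bar c_i + \frac{1}{N}\sum_{l=1}^{n_i}\frac{1}{1+p_{il}q_{i,n-1}}\right|^2}.$$

The right-hand side of the last equation is nonnegative whenever

$$|\mathrm{Im}\{f_i\}| \le \mathrm{Re}\{f_i\}\frac{1-c_i}{c_i}. \tag{37}$$

As this condition is always satisfied for $z \in D$ and we have defined $q_{i,0} = 0$, we can conclude that (34) and $\mathrm{Re}\{q_{i,n}\} \ge 0$ hold for all $n$.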

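Additionally, we have from (35) and (36) that

$$\mathrm{Re}\{f_i\}\mathrm{Re}\{q_{i,n}\} + \mathrm{Im}\{f_i\}\mathrm{Im}\{q_{i,n}\} = \frac{\left(\mathrm{Re}\{f_i\}^2 + \mathrm{Im}\{f_i\}^2\right)\left[(1-c_i)\bar c_i + \frac{1}{N}\sum_{l=1}^{n_i}\frac{1+p_{il}\mathrm{Re}\{q_{i,n-1}\}}{|1+p_{il}q_{i,n-1}|^2}\right]}{\left|(1-c_i)\bar c_i + \frac{1}{N}\sum_{l=1}^{n_i}\frac{1}{1+p_{il}q_{i,n-1}}\right|^2} \ge 0. \tag{38}$$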






Until here, we have proved that $\{q_{i,n}\}$ is a sequence of bounded analytic functions on $z \in D$ (the analyticity follows from the fact that $f_i$ is analytic on $\mathbb{C}\setminus\mathbb{R}_+$ and $q_{i,n}$ is a rational function with no pole in $D$). Let us now focus on the negative real axis, i.e., $z < 0$, which lies in the interior of $D$. Here, the following holds

$$q_{i,n+1} - q_{i,n} = (q_{i,n} - q_{i,n-1})\,\frac{f_i\,\frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}}{(1+p_{il}q_{i,n})(1+p_{il}q_{i,n-1})}}{\left[(1-c_i)\bar c_i + \frac{1}{N}\sum_{l=1}^{n_i}\frac{1}{1+p_{il}q_{i,n}}\right]\left[(1-c_i)\bar c_i + \frac{1}{N}\sum_{l=1}^{n_i}\frac{1}{1+p_{il}q_{i,n-1}}\right]}.$$

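As $f_i$ and all terms in the fraction of the right-hand side of the last equation are nonnegative, the differences $q_{i,n+1} - q_{i,n}$ and $q_{i,n} - q_{i,n-1}$ have the same sign. Thus, $\{q_{i,n}\}$ is either monotonically increasing or decreasing. Since $\{q_{i,n}\}$ is also bounded, it must converge. This implies by Vitali's convergence theorem that $\{q_{i,n}\}$ converges uniformly on all closed subsets of $D$ and that this limit is an analytic function. Call this limit $q_i = \lim_n q_{i,n}$.

We now define $\bar f_{i,n}$ by the quantities $f_i$ and $q_{i,n}$:

$$\bar f_{i,n} = \frac{1}{f_i}\frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}q_{i,n}}{1+p_{il}q_{i,n}}.$$

Clearly, $\{\bar f_{i,n}\}$ is a sequence of analytic bounded functions, converging for $z \in D$ to

$$\lim_{n\to\infty}\bar f_{i,n} = \frac{1}{f_i}\frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}q_i}{1+p_{il}q_i}. \tag{40}$$

With the above definition, $q_{i,n+1}$ satisfies

$$q_{i,n+1} = \frac{f_i}{(1-c_i)\bar c_i + \frac{1}{N}\sum_{l=1}^{n_i}\frac{1}{1+p_{il}q_{i,n}}} = \frac{f_i}{\bar c_i - \frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}q_{i,n}}{1+p_{il}q_{i,n}}} = \frac{f_i}{\bar c_i - f_i\bar f_{i,n}}. \tag{41}$$

Thus, we can write, from (40),

$$\bar f_{i,n+1} = \frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}}{(\bar c_i - f_i\bar f_{i,n})\left(1 + \frac{p_{il}f_i}{\bar c_i - f_i\bar f_{i,n}}\right)} = \frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}}{\bar c_i - f_i\bar f_{i,n} + p_{il}f_i}. \tag{42}$$

As a consequence, the restriction of the limit in (40) to $z < 0$ is identical to $\bar f_i$, and the fixed-point algorithm defined by (42) with $\bar f_{i,0} = 0$ converges to this limit for $z \in D$. From this point on, we therefore extend the definition of $\bar f_i$ to $D$ by

$$\bar f_i(z) = \lim_{n\to\infty}\bar f_{i,n}(z).$$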
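On the negative real axis, the link between the recursion for $q_{i,n}$ and the fixed-point equation for $\bar f_i$ can be checked with a few lines of code. This is a minimal sketch under stated assumptions: $f_i$ is treated as a given positive scalar, the values of $N$, $N_i$ and $p_{il}$ are toy numbers, and the names `q_sequence` and `f_bar_from_q` are hypothetical. It runs the recursion from $q_{i,0} = 0$, maps the limit to $\frac{1}{f_i}\frac{1}{N}\sum_l\frac{p_{il}q_i}{1+p_{il}q_i}$, and evaluates the residual of $\bar f_i = \frac{1}{N}\sum_l\frac{p_{il}}{\bar c_i - f_i\bar f_i + p_{il}f_i}$.

```python
def q_sequence(f, p, c, c_bar, N, tol=1e-13, max_iter=10_000):
    """Run q_n = f / ((1-c)*c_bar + (1/N) * sum_l 1/(1 + p_l * q_{n-1})) from q_0 = 0."""
    q = 0.0
    qs = [q]
    for _ in range(max_iter):
        denom = (1.0 - c) * c_bar + sum(1.0 / (1.0 + pl * q) for pl in p) / N
        q_next = f / denom
        qs.append(q_next)
        if abs(q_next - q) < tol:
            break
        q = q_next
    return qs


def f_bar_from_q(f, q, p, N):
    """Map q to (1/f) * (1/N) * sum_l p_l*q / (1 + p_l*q)."""
    return sum(pl * q / (1.0 + pl * q) for pl in p) / (N * f)


# Toy values (assumptions for illustration only): N = 8, N_i = 6, n_i = 4.
p = [0.5, 1.0, 1.5, 2.0]
N, N_i = 8, 6
c_bar = N_i / N
c = len(p) / N_i
f = 0.8

qs = q_sequence(f, p, c, c_bar, N)
f_bar = f_bar_from_q(f, qs[-1], p, N)
# Residual of the fixed-point equation satisfied by f_bar.
residual = f_bar - sum(pl / (c_bar - f * f_bar + pl * f) for pl in p) / N
```

The sketch only covers real $z < 0$, where all quantities are real and nonnegative; the extension to complex $z \in D$ discussed above is not exercised here.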

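From (38) and for $z \in D$, we have

$$\mathrm{Re}\{\bar f_i\} = \frac{1}{N}\sum_{l=1}^{n_i}p_{il}\,\frac{p_{il}|q_i|^2\mathrm{Re}\{f_i\} + \mathrm{Re}\{f_i\}\mathrm{Re}\{q_i\} + \mathrm{Im}\{f_i\}\mathrm{Im}\{q_i\}}{|f_i|^2\,|1+p_{il}q_i|^2} \ge 0.$$

Since $\mathbf{F} = \sum_{k=1}^{K}\bar f_k\mathbf{R}_k$ and the matrices $\mathbf{R}_k$ are Hermitian nonnegative definite, it follows that $\frac{1}{N}\operatorname{tr}\mathbf{A}(\mathbf{F} - z\mathbf{I}_N)^{-1}$ is uniformly bounded on every closed subset of $D$.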

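From the Vitali convergence theorem, the identity theorem, the analyticity of the functions under study, and the fact that $\frac{1}{N}\operatorname{tr}\mathbf{A}(\mathbf{B}_N - z\mathbf{I}_N)^{-1}$ and $\frac{1}{N}\operatorname{tr}\mathbf{A}(\mathbf{F} - z\mathbf{I}_N)^{-1}$ are uniformly bounded on all closed subsets of $z \in D$, we have that the convergence

$$\frac{1}{N}\operatorname{tr}\mathbf{A}(\mathbf{B}_N - z\mathbf{I}_N)^{-1} - \frac{1}{N}\operatorname{tr}\mathbf{A}\left(\sum_{i=1}^{K}\bar f_i\mathbf{R}_i - z\mathbf{I}_N\right)^{-1} \xrightarrow{\text{a.s.}} 0$$

holds true for all $z \in D$.

Applying the result for $\mathbf{A} = \mathbf{R}_i$, this is in particular

$$f_i - \frac{1}{N}\operatorname{tr}\mathbf{R}_i\left(\sum_{j=1}^{K}\bar f_j\mathbf{R}_j - z\mathbf{I}_N\right)^{-1} \xrightarrow{\text{a.s.}} 0 \tag{43}$$

for $z \in D$, where $\bar f_i$ is defined as the above limit. For $\mathbf{A} = \mathbf{I}_N$, this implies

$$m_N(z) - \frac{1}{N}\operatorname{tr}\left(\sum_{i=1}^{K}\bar f_i\mathbf{R}_i - z\mathbf{I}_N\right)^{-1} \xrightarrow{\text{a.s.}} 0$$

which finally proves the convergence.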



Step 2: Existence and Uniqueness 

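We will now prove the existence and the uniqueness of positive solutions $e_1(z), \ldots, e_K(z)$ for $z < 0$ and the convergence of the classical fixed-point algorithm to these values. In addition, we will show that the $e_i(z)$ have analytic extensions on $\mathbb{C}\setminus\mathbb{R}_+$ which are Stieltjes transforms of finite measures over $\mathbb{R}_+$ and satisfy the fundamental equations for $z \in D$. We first introduce some notations and useful identities. Until stated otherwise, we assume $z < 0$. Note that, similar to the auxiliary variables $\delta_i$ and $q_i$ in Step 1, we can define, for any pair of variables $x_i$ and $\bar x_i$, with $\bar x_i$ defined as the solution $y$ to $y = \frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}}{\bar c_i - x_i y + x_i p_{il}}$ such that $0 \le y < c_i\bar c_i/x_i$, the auxiliary variables $\Delta_1, \ldots, \Delta_K$, with the properties

$$x_i = \Delta_i\left((1-c_i)\bar c_i + \frac{1}{N}\sum_{l=1}^{n_i}\frac{1}{1+p_{il}\Delta_i}\right) = \Delta_i\left(\bar c_i - \frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}\Delta_i}{1+p_{il}\Delta_i}\right)$$

and

$$\bar c_i - x_i\bar x_i = (1-c_i)\bar c_i + \frac{1}{N}\sum_{l=1}^{n_i}\frac{1}{1+p_{il}\Delta_i} = \bar c_i - \frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}\Delta_i}{1+p_{il}\Delta_i}. \tag{44}$$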
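First note that the mapping between $x_i$ and $\Delta_i$ is unique. This unfolds from noticing, with some abuse of notation,

$$\frac{d x_i}{d\Delta_i} = \frac{d}{d\Delta_i}\left[\Delta_i\left((1-c_i)\bar c_i + \frac{1}{N}\sum_{l=1}^{n_i}\frac{1}{1+p_{il}\Delta_i}\right)\right] = (1-c_i)\bar c_i + \frac{1}{N}\sum_{l=1}^{n_i}\frac{1}{(1+p_{il}\Delta_i)^2} > 0$$

and therefore $x_i$ and $\Delta_i$ are one-to-one. Additionally, $x_i$ is a strictly increasing function of $\Delta_i$, with $\Delta_i = 0$ for $x_i = 0$. This ensures that $\Delta_i > 0$ if and only if $x_i > 0$.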
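Secondly, from the definition of $x_i$, we have

$$\frac{x_i}{\Delta_i} = \bar c_i - \frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}\Delta_i}{1+p_{il}\Delta_i}.$$

Note in particular that, since $\frac{p_{il}\Delta_i}{1+p_{il}\Delta_i} = \frac{p_{il}x_i}{x_i/\Delta_i + p_{il}x_i}$, the above equation reads

$$\bar c_i - \frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}\Delta_i}{1+p_{il}\Delta_i} = \bar c_i - \frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}x_i}{\left(\bar c_i - \frac{1}{N}\sum_{l'=1}^{n_i}\frac{p_{il'}\Delta_i}{1+p_{il'}\Delta_i}\right) + p_{il}x_i}$$

and therefore $\bar c_i - \frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}\Delta_i}{1+p_{il}\Delta_i}$ is one of the solutions of the implicit equation in $u$,

$$u = \bar c_i - \frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}x_i}{u + p_{il}x_i}.$$

Equivalently, writing $u = \bar c_i - x_i y$, it follows that $\frac{1}{x_i}\frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}\Delta_i}{1+p_{il}\Delta_i}$ is one of the solutions of the equation in $y$

$$y = \frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}}{\bar c_i - x_i y + p_{il}x_i}.$$

Since

$$0 \le \frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}\Delta_i}{1+p_{il}\Delta_i} < c_i\bar c_i,$$

this solution lies in $[0, c_i\bar c_i/x_i)$ and is exactly equal to $\bar x_i$. This proves that the equations in $(x_i, \bar x_i)$ can be written under the form of the equations in $(\Delta_i, x_i)$, as presented above.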

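We take the opportunity of the above definitions to notice that, for $x_i > x_i'$ and $\bar x_i'$, $\Delta_i'$ defined similarly as $\bar x_i$ and $\Delta_i$,

$$x_i\bar x_i - x_i'\bar x_i' = \frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}(\Delta_i - \Delta_i')}{(1+p_{il}\Delta_i)(1+p_{il}\Delta_i')} > 0 \tag{45}$$

whenever $\mathbf{P}_i \ne 0$. Therefore $x_i\bar x_i$ is a growing function of $x_i$ (or equivalently of $\Delta_i$). This will turn out to be a useful remark later.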
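The two monotonicity facts used in the remainder of Step 2, namely that $x_i\bar x_i$ grows with $x_i$ while $\bar x_i$ itself decreases with $x_i$, can be observed on a grid. The sketch below assumes toy values for $N$, $N_i$ and the $p_{il}$, and the helper name `x_bar` is hypothetical. It solves the equation in $y$ by the plain fixed-point iteration started at $0$.

```python
def x_bar(x, p, c_bar, N, tol=1e-14, max_iter=100_000):
    """Solve y = (1/N) * sum_l p_l / (c_bar - x*y + x*p_l) by iteration from y = 0."""
    y = 0.0
    for _ in range(max_iter):
        y_next = sum(pl / (c_bar - x * y + x * pl) for pl in p) / N
        if abs(y_next - y) < tol:
            return y_next
        y = y_next
    return y


# Toy values (assumptions for illustration only): N = 8, N_i = 6, n_i = 4.
p = [0.5, 1.0, 1.5, 2.0]
N, N_i = 8, 6
c_bar = N_i / N

grid = [0.1 * k for k in range(1, 31)]          # x = 0.1, 0.2, ..., 3.0
bars = [x_bar(x, p, c_bar, N) for x in grid]    # x_bar as a function of x
prods = [x * xb for x, xb in zip(grid, bars)]   # x * x_bar as a function of x
```

A grid check of this kind is of course no substitute for (45); it only makes the direction of both monotonicities easy to see for one set of parameters.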

We are now in position to prove the step of uniqueness. Define, for $i \in \{1, \ldots, K\}$, the functions

$$h_i : (x_1, \ldots, x_K) \mapsto \frac{1}{N}\operatorname{tr}\mathbf{R}_i\left(\sum_{j=1}^{K}\bar x_j\mathbf{R}_j - z\mathbf{I}_N\right)^{-1}$$

with $\bar x_j$ the unique solution of the equation in $y$

$$y = \frac{1}{N}\sum_{l=1}^{n_j}\frac{p_{jl}}{\bar c_j - x_j y + x_j p_{jl}} \tag{46}$$

such that $0 \le \bar x_j < c_j\bar c_j/x_j$.

We will prove in the following that the multivariate function $h = (h_1, \ldots, h_K)$ is a standard function (or standard interference function), defined in [38], as follows:
Definition 2: A function $h(x_1, \ldots, x_K) \in \mathbb{R}^K$ is said to be standard if it fulfills the following conditions:

1) Positivity: for each $j$, if $x_1, \ldots, x_K > 0$, then $h_j(x_1, \ldots, x_K) > 0$.

2) Monotonicity: if $x_1 > x_1', \ldots, x_K > x_K'$, then for all $j$, $h_j(x_1, \ldots, x_K) > h_j(x_1', \ldots, x_K')$.

3) Scalability: for all $\alpha > 1$ and for all $j$, $\alpha h_j(x_1, \ldots, x_K) > h_j(\alpha x_1, \ldots, \alpha x_K)$.

The important result regarding standard functions, [38, Theorem 2], is given as follows:

Theorem 8: If a $K$-variate function $h(x_1, \ldots, x_K)$ is standard and there exists $(x_1, \ldots, x_K)$ such that for all $j$, $x_j > h_j(x_1, \ldots, x_K)$, then the fixed-point algorithm that consists in setting

$$x_j^{(t+1)} = h_j\left(x_1^{(t)}, \ldots, x_K^{(t)}\right)$$

for $t \ge 1$ and for any initial values $x_1^{(1)}, \ldots, x_K^{(1)} > 0$ converges to the unique jointly positive solution of the system of $K$ equations

$$x_j = h_j(x_1, \ldots, x_K)$$

with $j \in \{1, \ldots, K\}$.

In order to prove that there exist $x_1, \ldots, x_K$ such that $x_j > h_j(x_1, \ldots, x_K)$ for all $j$, it is sufficient to notice that $h_j(x_1, \ldots, x_K) \le R/|z|$ for all $j$. Thus, for $x_j > R/|z|$ for all $j$, $x_j > h_j(x_1, \ldots, x_K)$ holds for all $j$. Therefore, by showing that $h = (h_1, \ldots, h_K)$ is a standard function, we will prove that the classical fixed-point algorithm converges to the unique set of positive solutions $e_1, \ldots, e_K$, when $z < 0$.

The positivity condition is straightforward as $\bar x_i$ is positive for $x_i$ positive and therefore $h_j(x_1, \ldots, x_K)$ is always positive whenever $x_1, \ldots, x_K$ are nonnegative.

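The scalability is also rather direct. Let $\alpha > 1$, then

$$\alpha h_j(x_1, \ldots, x_K) - h_j(\alpha x_1, \ldots, \alpha x_K) = \frac{1}{N}\operatorname{tr}\mathbf{R}_j\left(\sum_{k=1}^{K}\frac{\bar x_k}{\alpha}\mathbf{R}_k - \frac{z}{\alpha}\mathbf{I}_N\right)^{-1} - \frac{1}{N}\operatorname{tr}\mathbf{R}_j\left(\sum_{k=1}^{K}\bar x_k^{(\alpha)}\mathbf{R}_k - z\mathbf{I}_N\right)^{-1}$$

where we denoted $\bar x_j^{(\alpha)}$ the unique solution to (46) within $[0, c_j\bar c_j/(\alpha x_j))$ with $x_j$ replaced by $\alpha x_j$. From Lemma 6, it suffices to show that

$$\sum_{k=1}^{K}\left(\bar x_k^{(\alpha)} - \frac{\bar x_k}{\alpha}\right)\mathbf{R}_k - z\left(1 - \frac{1}{\alpha}\right)\mathbf{I}_N$$

is positive definite. Since $\alpha x_i > x_i$, we have from the property (45) that

$$\alpha x_k\bar x_k^{(\alpha)} - x_k\bar x_k > 0$$

or equivalently

$$\bar x_k^{(\alpha)} - \frac{\bar x_k}{\alpha} > 0.$$

Along with $1 - 1/\alpha > 0$ and $z < 0$, this ensures that $\alpha h_j(x_1, \ldots, x_K) > h_j(\alpha x_1, \ldots, \alpha x_K)$.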
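The monotonicity requires some more calculus. This unfolds from considering $\bar x_i$ as a function of $\Delta_i$, by verifying that $\frac{d}{d\Delta_i}\bar x_i$ is negative. Using $\bar x_i = \frac{1}{x_i}\frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}\Delta_i}{1+p_{il}\Delta_i}$ together with $x_i = \Delta_i\left(\bar c_i - \frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}\Delta_i}{1+p_{il}\Delta_i}\right)$, differentiation gives

$$\frac{d}{d\Delta_i}\bar x_i = \frac{1}{\Delta_i^2}\,\frac{\left(\frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}\Delta_i}{1+p_{il}\Delta_i}\right)^2 - \bar c_i\,\frac{1}{N}\sum_{l=1}^{n_i}\frac{(p_{il}\Delta_i)^2}{(1+p_{il}\Delta_i)^2}}{\left(\bar c_i - \frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}\Delta_i}{1+p_{il}\Delta_i}\right)^2}.$$

From the Cauchy-Schwarz inequality, we have

$$\left(\frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}\Delta_i}{1+p_{il}\Delta_i}\right)^2 \le \left(\frac{1}{N}\sum_{l=1}^{n_i}1\right)\frac{1}{N}\sum_{l=1}^{n_i}\frac{(p_{il}\Delta_i)^2}{(1+p_{il}\Delta_i)^2} = c_i\bar c_i\,\frac{1}{N}\sum_{l=1}^{n_i}\frac{(p_{il}\Delta_i)^2}{(1+p_{il}\Delta_i)^2} \tag{47}$$

which is sufficient to conclude that $\frac{d}{d\Delta_i}\bar x_i < 0$. Since $\Delta_i$ is an increasing function of $x_i$, we have that $\bar x_i$ is a decreasing function of $x_i$, i.e., $\frac{d}{dx_i}\bar x_i < 0$. Therefore, for two sets $x_1, \ldots, x_K$ and $x_1', \ldots, x_K'$ of positive values such that $x_j > x_j'$, defining $\bar x_j'$ equivalently as $\bar x_j$ for the terms $x_j'$, we have $\bar x_k' > \bar x_k$. Therefore, from Lemma 6, we finally have

$$h_j(x_1, \ldots, x_K) - h_j(x_1', \ldots, x_K') = \frac{1}{N}\operatorname{tr}\mathbf{R}_j\left(\sum_{k=1}^{K}\bar x_k\mathbf{R}_k - z\mathbf{I}_N\right)^{-1} - \frac{1}{N}\operatorname{tr}\mathbf{R}_j\left(\sum_{k=1}^{K}\bar x_k'\mathbf{R}_k - z\mathbf{I}_N\right)^{-1} > 0. \tag{48}$$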



This proves the monotonicity condition and, finally, that $h = (h_1, \ldots, h_K)$ is a standard function.

It follows from Theorem 8 that $(e_1, \ldots, e_K)$ is uniquely defined and that the classical fixed-point algorithm converges to this solution from any initialization point (remember that, at each step of the algorithm, the set $\bar e_1, \ldots, \bar e_K$ must be evaluated, possibly thanks to a further fixed-point algorithm).
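The nested structure of the algorithm (an outer iteration on $e_1, \ldots, e_K$, with an inner iteration for each $\bar e_k$) may be sketched as follows. This is an illustration under simplifying assumptions, not the authors' implementation: the matrices $\mathbf{R}_k$ are taken diagonal so that the trace reduces to a sum over diagonal entries, all numerical values are toy choices, and the function names are hypothetical.

```python
def e_bar_inner(e, p, c_bar, N, tol=1e-15, max_iter=100_000):
    """Inner loop: solve y = (1/N) * sum_l p_l / (c_bar - e*y + e*p_l) from y = 0."""
    y = 0.0
    for _ in range(max_iter):
        y_next = sum(pl / (c_bar - e * y + e * pl) for pl in p) / N
        if abs(y_next - y) < tol:
            return y_next
        y = y_next
    return y


def h(e, r, p, c_bar, z):
    """One application of (e_1..e_K) -> (h_1..h_K), assuming diagonal R_k.

    r[k][n] is the n-th diagonal entry of R_k, p[k] lists the eigenvalues of P_k.
    """
    K, N = len(r), len(r[0])
    e_bar = [e_bar_inner(e[k], p[k], c_bar[k], N) for k in range(K)]
    # n-th diagonal entry of sum_k e_bar_k R_k - z I_N
    diag = [sum(e_bar[k] * r[k][n] for k in range(K)) - z for n in range(N)]
    e_new = [sum(r[i][n] / diag[n] for n in range(N)) / N for i in range(K)]
    return e_new, e_bar


def solve_fundamental(r, p, c_bar, z, e0=None, tol=1e-12, max_iter=10_000):
    """Outer loop: iterate e <- h(e) until successive iterates agree within tol."""
    K = len(r)
    e = list(e0) if e0 is not None else [1.0] * K
    for _ in range(max_iter):
        e_new, _ = h(e, r, p, c_bar, z)
        converged = max(abs(a - b) for a, b in zip(e_new, e)) < tol
        e = e_new
        if converged:
            break
    _, e_bar = h(e, r, p, c_bar, z)
    return e, e_bar


# Toy data (all values are illustrative assumptions): K = 2, N = 4.
r = [[1.0, 0.5, 2.0, 1.5], [0.3, 1.2, 0.8, 1.0]]   # diagonals of R_1, R_2
p = [[1.0, 2.0], [0.5, 1.5, 1.0]]                   # eigenvalues of P_1, P_2
c_bar = [3 / 4, 4 / 4]                              # N_1 = 3, N_2 = 4
z = -1.0

e, e_bar = solve_fundamental(r, p, c_bar, z)
e_alt, _ = solve_fundamental(r, p, c_bar, z, e0=[10.0, 0.01])
e_check, _ = h(e, r, p, c_bar, z)
```

Running the solver from two different initializations, as done with `e` and `e_alt`, is a simple way to see the uniqueness statement at work. For non-diagonal $\mathbf{R}_k$ the sum over diagonal entries would have to be replaced by a genuine matrix inverse and trace.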

We will now show that $e_i(z)$ has an analytic extension on $z \in \mathbb{C}\setminus\mathbb{R}_+$ which is the Stieltjes transform of a finite measure supported by $\mathbb{R}_+$. For this proof, consider the matrices $\mathbf{P}_{[p],i} \in \mathbb{C}^{n_ip\times n_ip}$ and $\mathbf{H}_{[p],i} \in \mathbb{C}^{Np\times N_ip}$ for all $i$, defined as the Kronecker products $\mathbf{P}_{[p],i} = \mathbf{P}_i\otimes\mathbf{I}_p$, $\mathbf{H}_{[p],i} = \mathbf{H}_i\otimes\mathbf{I}_p$, such that $\mathbf{P}_{[p],i}$ and $\mathbf{R}_{[p],i} = \mathbf{H}_{[p],i}\mathbf{H}_{[p],i}^{\sf H}$ have the same spectral distributions as the matrices $\mathbf{P}_i$ and $\mathbf{R}_i$, respectively. It is easy to see that the solutions of the implicit equations (13) for $z \in \mathbb{C}\setminus\mathbb{R}_+$ remain unchanged by substituting the $\mathbf{P}_{[p],i}$ and $\mathbf{R}_{[p],i}$ to the $\mathbf{P}_i$ and $\mathbf{R}_i$, respectively, for any $p$. Denoting similarly $f_{[p],i}$ the $f_i$ adapted to $\mathbf{P}_{[p],i}$ and $\mathbf{H}_{[p],i}$, from the convergence result of Step 1, we can choose $f_{[1],i}, f_{[2],i}, \ldots$ a sequence of the set of probability one where convergence is ensured as $p$ grows large ($N$ and the $n_i$ are kept fixed). Call $e_i'(z)$ the limit.

We wish to prove that $e_i'$, seen as a function of $z$, is the Stieltjes transform of a distribution function, whose restriction to $\mathbb{R}_-$ matches $e_i$. For this, we prove the defining properties of a Stieltjes transform, provided in Lemma 1.
By Vitali's convergence theorem [40], $e_i'$ is analytic on $\mathbb{C}^+$ since $e_i'$ is the limit of a sequence of analytic functions, bounded on every compact of $\mathbb{C}\setminus\mathbb{R}_+$. It is clear that for $z \in \mathbb{C}^+$, $\mathrm{Im}[f_{[p],i}(z)] > 0$, $\mathrm{Im}[zf_{[p],i}(z)] > 0$ and $|yf_{[p],i}(\mathrm{i}y)| \le R$ for $y > 0$. This implies that for $z \in \mathbb{C}^+$, $\mathrm{Im}[e_i'(z)] \ge 0$, $\mathrm{Im}[ze_i'(z)] \ge 0$ and $\lim_{y\to\infty} -\mathrm{i}ye_i'(\mathrm{i}y) \le R$. In addition, note that, for $z \in \mathbb{C}^+$,

M/^]>^ (jR p + N)2 M-]>o 

and 

1 Kr 2 t 

Im[zf ^ ] ^N (RP + \z\y Im[z]>0 

with $r$ a lower bound on the smallest non-zero eigenvalues of $\mathbf{R}_1, \ldots, \mathbf{R}_K$ (we naturally assume all $\mathbf{R}_k$ non-zero) and $t$ a lower bound on the smallest non-zero eigenvalues of $\mathbf{T}_1, \ldots, \mathbf{T}_K$ (again, none assumed identically zero).
Take z G C + and £ < \ min(^ ^ p ^| z ^ 2 Im[z], (n.p+^y im H)- There now exists p such that p > pa implies 
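$|\mathrm{Im}[f_{[\phi(p)],i}] - \mathrm{Im}[e_i']| < \varepsilon/2$ and $|\mathrm{Im}[zf_{[\phi(p)],i}] - \mathrm{Im}[ze_i']| < \varepsilon/2$, and therefore $\mathrm{Im}[e_i'] > \varepsilon/2$ and $\mathrm{Im}[ze_i'(z)] > \varepsilon/2$, so that $e_i'(z)$ is the Stieltjes transform of a finite measure on $\mathbb{R}_+$. Moreover, since $e_i'(z) = \lim f_{[p],i}(z)$ on $D$, from (43), $e_i'(z)$ satisfies the equations (13) for all $z \in D$.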

Consider now two sets of Stieltjes transforms $(e_1'(z), \ldots, e_K'(z))$ and $(e_1''(z), \ldots, e_K''(z))$, $z \in \mathbb{C}\setminus\mathbb{R}_+$, which are solutions of the fixed-point equation for $z < 0$. Since $e_i'(z) = e_i''(z)$ for all $z < 0$, and $e_i'(z) - e_i''(z)$ is holomorphic on $\mathbb{C}\setminus\mathbb{R}_+$ as the difference of Stieltjes transforms, $e_i'(z) - e_i''(z) = 0$ over $\mathbb{C}\setminus\mathbb{R}_+$ [41] by the identity theorem. This therefore proves, in addition to point-wise uniqueness on the negative half-line, the uniqueness of the Stieltjes transform solution of the functional implicit equation such that, for $z < 0$, $0 \le \bar e_i < c_i\bar c_i/e_i$ for all $i$. Moreover, this solution satisfies the fundamental equations for $z \in D$.



Step 3: Convergence of /, 

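For this step, we follow the same approach as in [5]. Denote

$$\varepsilon_{N,i} \triangleq f_i - \frac{1}{N}\operatorname{tr}\mathbf{R}_i\left(\sum_{k=1}^{K}\bar f_k\mathbf{R}_k - z\mathbf{I}_N\right)^{-1}$$

and recall the definitions of $f_i$, $e_i$, $\bar f_i$ and $\bar e_i$:

$$\begin{aligned}
f_i &= \frac{1}{N}\operatorname{tr}\mathbf{R}_i(\mathbf{B}_N - z\mathbf{I}_N)^{-1} \\
e_i &= \frac{1}{N}\operatorname{tr}\mathbf{R}_i\left(\sum_{j=1}^{K}\bar e_j\mathbf{R}_j - z\mathbf{I}_N\right)^{-1} \\
\bar f_i &= \frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}}{\bar c_i - f_i\bar f_i + p_{il}f_i}, \quad \bar f_i \in [0, c_i\bar c_i/f_i) \\
\bar e_i &= \frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}}{\bar c_i - e_i\bar e_i + p_{il}e_i}, \quad \bar e_i \in [0, c_i\bar c_i/e_i).
\end{aligned}$$

From the definitions above, we have the following set of inequalities

$$f_i \le \frac{R}{|z|}, \quad e_i \le \frac{R}{|z|}, \quad \bar f_i \le \frac{P}{(1-c_i)\bar c_i}, \quad \bar e_i \le \frac{P}{(1-c_i)\bar c_i}. \tag{49}$$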
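We will show in the following that

$$e_i - f_i \xrightarrow{\text{a.s.}} 0 \tag{50}$$

for all $i \in \{1, \ldots, K\}$. We start by considering the following differences

$$\begin{aligned}
f_i - e_i &= \sum_{j=1}^{K}(\bar e_j - \bar f_j)\,\frac{1}{N}\operatorname{tr}\mathbf{R}_i\left(\sum_{k=1}^{K}\bar e_k\mathbf{R}_k - z\mathbf{I}_N\right)^{-1}\mathbf{R}_j\left(\sum_{k=1}^{K}\bar f_k\mathbf{R}_k - z\mathbf{I}_N\right)^{-1} + \varepsilon_{N,i} \\
\bar e_i - \bar f_i &= \frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}^2(f_i - e_i) - p_{il}\left[f_i\bar f_i - e_i\bar e_i\right]}{(\bar c_i - e_i\bar e_i + p_{il}e_i)(\bar c_i - f_i\bar f_i + p_{il}f_i)} \\
f_i\bar f_i - e_i\bar e_i &= \bar f_i(f_i - e_i) + e_i(\bar f_i - \bar e_i).
\end{aligned}$$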

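For notational convenience, we define the following values

$$\alpha \triangleq \sup_i \mathrm{E}\left[|f_i - e_i|^4\right], \qquad \bar\alpha \triangleq \sup_i \mathrm{E}\left[|\bar f_i - \bar e_i|^4\right].$$

It is thus sufficient to show that $\alpha$ is summable in order to prove (50). By applying (49) to the absolute value of the first difference, we obtain

$$|f_i - e_i| \le \frac{KR^2}{|z|^2}\sup_i|\bar f_i - \bar e_i| + \sup_i|\varepsilon_{N,i}|$$

and hence

$$\alpha \le \frac{8K^4R^8}{|z|^8}\bar\alpha + \frac{8C}{N^2} \tag{51}$$

for some $C > 0$ such that $\mathrm{E}[\sup_i|\varepsilon_{N,i}|^4] \le K\sup_i\mathrm{E}[|\varepsilon_{N,i}|^4] \le C/N^2$. Similarly, we have for the third difference

$$|f_i\bar f_i - e_i\bar e_i| \le |\bar f_i||f_i - e_i| + |e_i||\bar f_i - \bar e_i| \le \frac{P}{(1-c_+)\bar c_-}\sup_i|f_i - e_i| + \frac{R}{|z|}\sup_i|\bar f_i - \bar e_i|.$$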
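This result can be used to upper-bound the second difference term, which writes

$$\begin{aligned}
|\bar f_i - \bar e_i| &\le \frac{1}{(1-c_+)^2\bar c_-^2}\left(P^2\sup_i|f_i - e_i| + P\,|f_i\bar f_i - e_i\bar e_i|\right) \\
&\le \frac{1}{(1-c_+)^2\bar c_-^2}\left(P^2\sup_i|f_i - e_i| + P\left[\frac{P}{(1-c_+)\bar c_-}\sup_i|f_i - e_i| + \frac{R}{|z|}\sup_i|\bar f_i - \bar e_i|\right]\right) \\
&\le \frac{P^2(\bar c_- + 1)}{(1-c_+)^3\bar c_-^3}\sup_i|f_i - e_i| + \frac{RP}{|z|(1-c_+)^2\bar c_-^2}\sup_i|\bar f_i - \bar e_i|.
\end{aligned}$$

Hence

$$\bar\alpha \le \frac{8P^8(\bar c_- + 1)^4}{(1-c_+)^{12}\bar c_-^{12}}\alpha + \frac{8R^4P^4}{|z|^4(1-c_+)^8\bar c_-^8}\bar\alpha.$$

For any $z$ satisfying $|z| > \frac{2RP}{(1-c_+)^2\bar c_-^2}$, we have $\frac{8R^4P^4}{|z|^4(1-c_+)^8\bar c_-^8} < 1/2$ and thus

$$\bar\alpha \le \frac{16P^8(\bar c_- + 1)^4}{(1-c_+)^{12}\bar c_-^{12}}\alpha.$$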
Plugging this result into (51) yields

$$\alpha \le \frac{128K^4R^8P^8(\bar c_- + 1)^4}{|z|^8(1-c_+)^{12}\bar c_-^{12}}\alpha + \frac{8C}{N^2}.$$

Take $0 < \varepsilon < 1$. It is easy to check that for $|z| > \frac{2^{7/8}\sqrt{K(\bar c_- + 1)}\,RP}{(1-c_+)^{3/2}\bar c_-^{3/2}(1-\varepsilon)^{1/8}}$, $\frac{128K^4R^8P^8(\bar c_- + 1)^4}{|z|^8(1-c_+)^{12}\bar c_-^{12}} < 1 - \varepsilon$ and thus

$$\alpha \le \frac{8C}{\varepsilon N^2}. \tag{53}$$

Since $C$ does not depend on $N$, $\alpha$ is clearly summable which, along with the Markov inequality and the Borel-Cantelli lemma, concludes the proof.

Finally, taking the same steps as previously, we also have

$$\mathrm{E}\left[|m_N(z) - \bar m_N(z)|^4\right] \le \frac{8C}{\varepsilon N^2}$$

for some $|z|$ large enough. For these $z$, the same conclusion holds: $m_N(z) - \bar m_N(z) \xrightarrow{\text{a.s.}} 0$. From the Vitali convergence theorem and the identity theorem, since $f_i$ and $e_i$ are uniformly bounded on all closed sets of $\mathbb{C}\setminus\mathbb{R}_+$ and analytic, we finally have that the convergence is true for all $z \in \mathbb{C}\setminus\mathbb{R}_+$. The almost sure convergence of the Stieltjes transform implies the almost sure weak convergence of $F_N - \bar F_N$ to 0, uniformly over every compact set of $\mathbb{R}_+$, which is our final result.

This concludes the proof of Theorem 7 for surely bounded $\mathbf{R}_i$.

1) Almost sure boundedness of $\|\mathbf{R}_i\|$:

To extend Theorem 7 to the case where $\|\mathbf{R}_i\|$ is only almost surely bounded, we merely apply the Tonelli theorem (Lemma 9). Call $(\Omega_R, \mathcal{F}_R, P_R)$ the probability space that generates the sequences of matrices of growing sizes $\{\mathbf{R}_i, 1 \le i \le K, N_i \in \mathbb{N}\}$, $(\Omega_W, \mathcal{F}_W, P_W)$ the probability space that generates the sequences of matrices of growing sizes $\{\mathbf{W}_i, 1 \le i \le K, N_i \in \mathbb{N}\}$, and $(\Omega_R\times\Omega_W, \mathcal{F}_R\times\mathcal{F}_W, Q)$ their product space. Denote $A$ the subspace of $\mathcal{F}_R\times\mathcal{F}_W$ for which $F_N - \bar F_N \Rightarrow 0$. Then, from the Tonelli theorem, Lemma 9,

$$Q(A) = \int_{\Omega_R\times\Omega_W} 1_A(r,w)\,Q(d(r,w)) = \int_{\Omega_R}\int_{\Omega_W} 1_A(r,w)\,P_W(dw)\,P_R(dr).$$

Take $r$ such that the $\|\mathbf{R}_i\|$ are all uniformly bounded with growing $N$. Then, from Theorem 7, for this $r$, $\int_{\Omega_W} 1_A(r,w)\,P_W(dw) = 1$. But these $r \in \Omega_R$ belong to a space of probability one, as the intersection of $K$ spaces of probability one, and finally $Q(A) = 1$.

Appendix B 
Proof of Theorem 3 

Following for simplicity the notations of Appendix A, we use here the variable $e_i(-\sigma^2)$ in place of $a_i(\sigma^2)$. It is easy to see (e.g. [11, Definition 3.2]) that, for $F$ a probability distribution function with support in $\mathbb{R}_+$,

$$\int_0^\infty \log\left(1 + \frac{t}{x}\right)dF(t) = \int_x^\infty\left(\frac{1}{w} - m_F(-w)\right)dw$$

where $m_F(z)$ is the Stieltjes transform of $F$ (this is sometimes called the Shannon-transform in $1/x$). In particular,

$$I_N(\sigma^2) = \frac{1}{N}\log\det\left(\mathbf{I}_N + \frac{1}{\sigma^2}\mathbf{B}_N\right) = \int_0^\infty\log\left(1 + \frac{t}{\sigma^2}\right)dF_N(t) = \int_{\sigma^2}^\infty\left(\frac{1}{t} - m_N(-t)\right)dt.$$

We will first show that the expression $\bar I_N(\sigma^2)$ given in Theorem 3 satisfies the same property with $\bar F_N$. For notational simplicity, we will write $e_i = e_i(-\sigma^2)$ and $\bar e_i = \bar e_i(-\sigma^2)$.

First note that the system of equations (13) is unchanged if we extend the $\mathbf{P}_i$ matrices into $N_i\times N_i$ diagonal matrices filled with $N_i - n_i$ zero eigenvalues. Therefore, we can assume that all $\mathbf{P}_i$ have size $N_i\times N_i$, although we restrict the measure of eigenvalues of $\mathbf{P}_i$ to have a mass $1 - c_i$ in zero. Since this does not alter the equations (13), we have in particular $\bar e_i < \bar c_i/e_i$ for $\sigma^2 > 0$.
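The relation between the Shannon transform and the Stieltjes transform recalled at the start of this appendix can be checked on a discrete distribution. The sketch below assumes that $F$ places equal mass on a handful of arbitrary atoms (illustrative values), evaluates $\int\log(1+t/x)\,dF(t)$ directly, and compares it with a midpoint-rule approximation of $\int_x^\infty(1/w - m_F(-w))\,dw$ after the substitution $w = x/u$. The function names are hypothetical.

```python
import math


def stieltjes(atoms, z):
    """Stieltjes transform m_F(z) of the uniform distribution on the given atoms."""
    return sum(1.0 / (t - z) for t in atoms) / len(atoms)


def shannon_direct(atoms, x):
    """Direct evaluation of the integral of log(1 + t/x) against F."""
    return sum(math.log(1.0 + t / x) for t in atoms) / len(atoms)


def shannon_via_stieltjes(atoms, x, n=20_000):
    """Midpoint rule for the integral over [x, inf) of (1/w - m_F(-w)) dw.

    The substitution w = x/u, u in (0, 1], maps the half-line to a bounded interval.
    """
    total = 0.0
    for k in range(n):
        u = (k + 0.5) / n
        w = x / u
        total += (1.0 / w - stieltjes(atoms, -w)) * x / (u * u)
    return total / n


atoms = [0.2, 0.7, 1.3, 2.1, 3.0]   # illustrative eigenvalues
x = 0.5                              # plays the role of sigma^2
lhs = shannon_direct(atoms, x)
rhs = shannon_via_stieltjes(atoms, x)
```

The midpoint rule is chosen only for brevity; any quadrature that handles the decay of the integrand at infinity would serve equally well.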



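This being said, $\bar I_N$ is given by

$$\bar I_N(\sigma^2) = \frac{1}{N}\log\det\left(\mathbf{I}_N + \frac{1}{\sigma^2}\sum_{i=1}^{K}\bar e_i\mathbf{R}_i\right) + \sum_{i=1}^{K}\left[\frac{1}{N}\log\det\left([\bar c_i - e_i\bar e_i]\mathbf{I}_{N_i} + e_i\mathbf{P}_i\right) - \bar c_i\log(\bar c_i)\right].$$

Calling $I$ the function

$$I : (x_1, \ldots, x_K, \bar x_1, \ldots, \bar x_K, \sigma^2) \mapsto \frac{1}{N}\log\det\left(\mathbf{I}_N + \frac{1}{\sigma^2}\sum_{i=1}^{K}\bar x_i\mathbf{R}_i\right) + \sum_{i=1}^{K}\left[\frac{1}{N}\log\det\left([\bar c_i - x_i\bar x_i]\mathbf{I}_{N_i} + x_i\mathbf{P}_i\right) - \bar c_i\log(\bar c_i)\right]$$

we have

$$\begin{aligned}
\frac{\partial I}{\partial\bar x_i}(e_1, \ldots, e_K, \bar e_1, \ldots, \bar e_K, \sigma^2) &= e_i - e_i\,\frac{1}{N}\sum_{l=1}^{N_i}\frac{1}{\bar c_i - e_i\bar e_i + e_ip_{il}} \\
\frac{\partial I}{\partial x_i}(e_1, \ldots, e_K, \bar e_1, \ldots, \bar e_K, \sigma^2) &= \bar e_i - \bar e_i\,\frac{1}{N}\sum_{l=1}^{N_i}\frac{1}{\bar c_i - e_i\bar e_i + e_ip_{il}}.
\end{aligned}$$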

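In order to proceed, note that we can write $\bar c_i$ in the following way:

$$\bar c_i = \frac{1}{N}\sum_{l=1}^{N_i}\frac{\bar c_i - e_i\bar e_i + e_ip_{il}}{\bar c_i - e_i\bar e_i + e_ip_{il}} = (\bar c_i - e_i\bar e_i)\,\frac{1}{N}\sum_{l=1}^{N_i}\frac{1}{\bar c_i - e_i\bar e_i + e_ip_{il}} + \frac{1}{N}\sum_{l=1}^{N_i}\frac{e_ip_{il}}{\bar c_i - e_i\bar e_i + e_ip_{il}}$$

from which it follows that

$$\bar c_i - e_i\bar e_i = (\bar c_i - e_i\bar e_i)\,\frac{1}{N}\sum_{l=1}^{N_i}\frac{1}{\bar c_i - e_i\bar e_i + e_ip_{il}}, \quad\text{i.e.,}\quad (\bar c_i - e_i\bar e_i)\left(1 - \frac{1}{N}\sum_{l=1}^{N_i}\frac{1}{\bar c_i - e_i\bar e_i + e_ip_{il}}\right) = 0.$$

But we also know that $0 \le \bar e_i < \bar c_i/e_i$ and therefore $\bar c_i - e_i\bar e_i > 0$. This entails

$$\frac{1}{N}\sum_{l=1}^{N_i}\frac{1}{\bar c_i - e_i\bar e_i + e_ip_{il}} = 1. \tag{54}$$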
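Identity (54) lends itself to a quick numerical sanity check. The sketch below assumes toy values for $N$, $N_i$, the zero-padded eigenvalues of $\mathbf{P}_i$ and a positive scalar standing for $e_i$ (all illustrative, with hypothetical function names). It computes $\bar e_i$ by the inner fixed-point iteration and then evaluates the left-hand side of (54).

```python
def e_bar_extended(e, p_ext, c_bar, N, tol=1e-15, max_iter=100_000):
    """Solve y = (1/N) * sum_l p_l / (c_bar - e*y + e*p_l) over the zero-padded p_l."""
    y = 0.0
    for _ in range(max_iter):
        y_next = sum(pl / (c_bar - e * y + e * pl) for pl in p_ext) / N
        if abs(y_next - y) < tol:
            return y_next
        y = y_next
    return y


def lhs_54(e, e_bar, p_ext, c_bar, N):
    """Left-hand side of (54): (1/N) * sum over l = 1..N_i of 1/(c_bar - e*e_bar + e*p_l)."""
    return sum(1.0 / (c_bar - e * e_bar + e * pl) for pl in p_ext) / N


# Toy values (assumptions): N = 8, N_i = 6, n_i = 4 nonzero eigenvalues padded with zeros.
p_ext = [0.5, 1.0, 1.5, 2.0, 0.0, 0.0]
N = 8
c_bar = len(p_ext) / N    # N_i / N
e = 0.8

e_bar = e_bar_extended(e, p_ext, c_bar, N)
value = lhs_54(e, e_bar, p_ext, c_bar, N)
```

The zero eigenvalues contribute nothing to the equation defining $\bar e_i$ but they do contribute to the sum in (54), which is why the padded list is used in both functions.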
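From (54), we can then conclude

$$\frac{\partial I}{\partial\bar x_i}(e_1, \ldots, e_K, \bar e_1, \ldots, \bar e_K, \sigma^2) = 0, \qquad \frac{\partial I}{\partial x_i}(e_1, \ldots, e_K, \bar e_1, \ldots, \bar e_K, \sigma^2) = 0.$$

We therefore have, from the differentiation chain rule,

$$\begin{aligned}
\frac{d}{d\sigma^2}\bar I_N(\sigma^2) &= \sum_{i=1}^{K}\left[\frac{\partial I}{\partial e_i}\frac{\partial e_i}{\partial\sigma^2} + \frac{\partial I}{\partial\bar e_i}\frac{\partial\bar e_i}{\partial\sigma^2}\right] + \frac{\partial I}{\partial\sigma^2} \\
&= -\frac{1}{\sigma^4}\frac{1}{N}\operatorname{tr}\left[\sum_{i=1}^{K}\bar e_i\mathbf{R}_i\left(\mathbf{I}_N + \frac{1}{\sigma^2}\sum_{i=1}^{K}\bar e_i\mathbf{R}_i\right)^{-1}\right] \\
&= -\frac{1}{\sigma^2} + \frac{1}{N}\operatorname{tr}\left(\sigma^2\mathbf{I}_N + \sum_{i=1}^{K}\bar e_i\mathbf{R}_i\right)^{-1}.
\end{aligned}$$

Recognizing the Stieltjes transform of $\bar F_N$, we therefore have, along with the fact that $\bar I_N(\infty) = 0$,

$$\bar I_N(\sigma^2) = \int_{\sigma^2}^{\infty}\left(\frac{1}{t} - \bar m_N(-t)\right)dt$$

and therefore

$$\bar I_N(\sigma^2) = \int_0^\infty\log\left(1 + \frac{t}{\sigma^2}\right)d\bar F_N(t).$$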



In order to prove the almost sure convergence $I_N(\sigma^2) - \bar I_N(\sigma^2) \xrightarrow{\text{a.s.}} 0$, we simply need to remark that the support of the eigenvalues of $\mathbf{B}_N$ is bounded. Indeed, the non-zero eigenvalues of $\mathbf{W}_i\mathbf{W}_i^{\sf H}$ have unit modulus and therefore $\|\mathbf{B}_N\| \le KPR$. Similarly, the support of $\bar F_N$ is the support of the eigenvalues of $\sum_{i=1}^{K}\bar e_i\mathbf{R}_i$, which are bounded by $KPR$ as well.

As a consequence, for $\mathbf{B}_1, \mathbf{B}_2, \ldots$ a realization for which $F_N - \bar F_N \Rightarrow 0$ (these lie in a space of probability one), we have, from the dominated convergence theorem,

$$\int_0^\infty\log\left(1 + \frac{t}{\sigma^2}\right)d[F_N - \bar F_N](t) \to 0.$$

Hence the almost sure convergence of the instantaneous mutual information.

Because of sure boundedness of $\|\mathbf{B}_N\|$, an immediate application of the dominated convergence theorem on the probability space $\Omega$ that engenders the sequences of matrices $\mathbf{B}_1(\omega), \mathbf{B}_2(\omega), \ldots$, $\omega \in \Omega$, entails convergence in the first mean as well.
Appendix C
Proof of Theorem 5

In this section, we follow closely the derivations of Appendix A and use the variable $e_i(-\sigma^2)$ in place of $a_i(\sigma^2)$. To prove Theorem 5, we will pursue a similar approach as for the proof of Theorem 7, but we can now take advantage of all results derived so far.

First denote $d_i$ the unique positive solution, for $\sigma^2 > 0$, to

$$e_i = d_i\left(\bar c_i - \frac{1}{N}\sum_{l=1}^{n_i}\frac{p_{il}d_i}{1+p_{il}d_i}\right).$$

This solution exists and is unique due to the arguments given in the introduction of Step 2 of the proof of Theorem 7.

Similar to the proof of Theorem 3, we proceed by extending the matrix $\mathbf{P}_i$ to an $N_i$-dimensional matrix with the last $N_i - n_i$ diagonal entries filled with zeros. This way, we can write

$$e_i = d_i\left(\bar c_i - \frac{1}{N}\sum_{l=1}^{N_i}\frac{p_{il}d_i}{1+p_{il}d_i}\right) = \frac{1}{N}\sum_{l=1}^{N_i}\frac{d_i}{1+p_{il}d_i}.$$

Since $d_i$ is a continuous mapping of $e_i$, and $e_i \le R/\sigma^2$, it follows that $d_i$ is bounded from above.
Recall now that, for $\limsup c_i < 1$ for all $i$ and for some $z_0 < 0$, we have that $z < z_0$ implies



E[|/ 4 -e 4 | 4 ] =E 



1 N * A 



< 



c_ 

N 2 



for some constant $C > 0$, where $f_i$ is defined in (16). Also, from (18)

4' 

E 



1 Wl 

^ ~ N E i 

2=1 



PiiSi 



< 



N 2 



for some $C_1 > C$. From these two inequalities, we have



E 



< 



16Ci 
AT 2 ' 



Also, from an immediate application of the trace lemma, Lemma 5, we remind that 



E 



(i,0 



< 



N 2 



for some $C_2 > C_1$.
Together, this implies that for $z$ small enough and for any $k \in \{1, \ldots, n_i\}$,



E 



St 



1 



V 



w iI H i* ( B (»,fc) ^ zI n) HiW ife 



- 1 + p a d i N ^ i +paW HHH ( B(iifc) - zl 



N 



H,w 



< 



< 



E 

+ E 
E 

+ E 

136C 2 

TV 2 



1 ^\ di 1 



Ni 



1 Ni X 

iV ^ 1 

1=1 



N ^ 1 

' l 

Ni 



E 



+ PilSi N^i + KiW H H N (B (i;fe) - zl w ) 1 H 4 w jfe 



1 N t 

-Y — — 



1 Ni 

n E y 



<5, 



5, - w» H? (B (a) 



zl 



H,w 



-Y- 

N tt (l+PiiSi)(l+Pu^ k K? (B(i ;fe) - zl w ) ^.w^) 



This ensures that for $z < z_0$,

N 



1 (J, 
f-^ 1 



1 ^ 



(B (j>fc) - zI N ) 1 H iWtt 



(55) 



N^l+pudi N^ il+p . lW H kU H {B{ik) _ zlN ^ i HjWife 
irrespectively of the choice of k. 

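Since the function $f : x \mapsto \frac{x}{1+p_{il}x}$ is continuous and has positive derivative, it is a one-to-one continuous function. Therefore, for $\mathbf{B}_1, \mathbf{B}_2, \ldots$ a realization such that the convergence of (55) is ensured, we also have by continuity $d_i - \mathbf{w}_{ik}^{\sf H}\mathbf{H}_i^{\sf H}(\mathbf{B}_{(i,k)} - z\mathbf{I}_N)^{-1}\mathbf{H}_i\mathbf{w}_{ik} \to 0$. Finally,

$$d_i - \mathbf{w}_{ik}^{\sf H}\mathbf{H}_i^{\sf H}(\mathbf{B}_{(i,k)} - z\mathbf{I}_N)^{-1}\mathbf{H}_i\mathbf{w}_{ik} \xrightarrow{\text{a.s.}} 0. \tag{56}$$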



Noticing from (44) that $d_i = \frac{e_i}{\bar c_i - e_i\bar e_i}$, we have proved the convergence for $z < z_0$. The Vitali theorem then ensures that the convergence holds true for all $z < 0$ since $e_i$ and $\bar e_i$ have analytic extensions on a neighborhood of $\mathbb{R}_-$ (see the proof of Theorem 7, Step 1).

Since the quantities $d_i$ and $\mathbf{w}_{ik}^{\sf H}\mathbf{H}_i^{\sf H}(\mathbf{B}_{(i,k)} - z\mathbf{I}_N)^{-1}\mathbf{H}_i\mathbf{w}_{ik}$ are uniformly bounded for all $N$ (a result that holds surely since we assumed the $\mathbf{H}_i$ deterministic), the dominated convergence theorem also ensures that the convergence holds in the first mean.

In order to prove Corollary 1 in the almost sure form, we simply invoke the continuous mapping theorem [42, Theorem 2.3] for the function $\phi : x \mapsto \sum_{i=1}^{K}\sum_{k=1}^{n_i}\log(1 + p_{ik}x)$ on the convergence (56). The convergence in the mean sense is obtained using the boundedness of $d_i$ and $\mathbf{w}_{ik}^{\sf H}\mathbf{H}_i^{\sf H}(\mathbf{B}_{(i,k)} - z\mathbf{I}_N)^{-1}\mathbf{H}_i\mathbf{w}_{ik}$ uniformly on $N$ and hence the boundedness of their image by $\phi$. The dominated convergence theorem then gives the result.
Appendix D
Proof of Theorem 2

It was shown in (46) that, for any fixed $b_k(\sigma^2) > 0$, the following equation in $\bar b_k(\sigma^2)$:

$$\bar b_k(\sigma^2) = \frac{1}{N}\operatorname{tr}\mathbf{P}_k\left(b_k(\sigma^2)\mathbf{P}_k + \left[\bar c_k - b_k(\sigma^2)\bar b_k(\sigma^2)\right]\mathbf{I}_{n_k}\right)^{-1}$$

has a unique solution, satisfying $0 \le \bar b_k(\sigma^2) < c_k\bar c_k/b_k(\sigma^2)$. Thus, $\bar b_k(\sigma^2)$ is uniquely determined by $b_k(\sigma^2)$. Consider now the following functions for $k \in \{1, \ldots, K\}$ and $\sigma^2 > 0$:

$$h_k(x_1, \ldots, x_K) \triangleq \frac{1}{N}\sum_{j=1}^{N_k}\frac{\zeta_{kj}(\sigma^2)}{1 + \bar b_k\zeta_{kj}(\sigma^2)}$$

where $\bar b_k \in [0, c_k\bar c_k/x_k)$ and $\zeta_{kj}(\sigma^2) > 0$ are the unique solutions to the following fixed-point equations:

$$\bar b_k = \frac{1}{N}\operatorname{tr}\mathbf{P}_k\left(x_k\mathbf{P}_k + \left[\bar c_k - x_k\bar b_k\right]\mathbf{I}_{n_k}\right)^{-1} \tag{57}$$

$$\zeta_{kj}(\sigma^2) = \frac{1}{N}\operatorname{tr}\mathbf{R}_{kj}\left(\frac{1}{N}\sum_{k=1}^{K}\sum_{j=1}^{N_k}\frac{\bar b_k\mathbf{R}_{kj}}{1 + \bar b_k\zeta_{kj}(\sigma^2)} + \sigma^2\mathbf{I}_N\right)^{-1}. \tag{58}$$

Similar to the proof of Theorem 1, it is now sufficient to prove that the $K$-variate function $h : (x_1, \ldots, x_K) \mapsto (h_1, \ldots, h_K)$ is a standard function and to apply Theorem 8 to conclude on the existence and uniqueness of a solution to $x_k = h_k(x_1, \ldots, x_K)$ for all $k$. The associated fixed-point algorithm follows the recursive equations

$$x_k^{(t+1)} = h_k\left(x_1^{(t)}, \ldots, x_K^{(t)}\right), \quad k = 1, \ldots, K$$

for $t \ge 0$ and for any set of initial values $x_1^{(0)}, \ldots, x_K^{(0)} > 0$, which then converge, as $t \to \infty$, to the fixed point.

Showing positivity is straightforward: for $\sigma^2 > 0$, we have $\zeta_{kj}(\sigma^2) > 0$ by Theorem 9 in Appendix G and $\bar b_k \ge 0$ by its definition. Thus, $h_k(x_1, \ldots, x_K) > 0$ for all $x_1, \ldots, x_K > 0$.

To prove monotonicity of $h_k(x_1, \ldots, x_K)$, we first recall the following result from (45). Let $x_k > x_k'$, and consider $\bar b_k$ and $\bar b_k'$ the corresponding solutions to (57). Then,

$$\text{(i)}\ \ \bar b_k < \bar b_k' \qquad \text{(ii)}\ \ x_k\bar b_k > x_k'\bar b_k'. \tag{59}$$

We now prove a further result. Let $\sigma^2 > 0$ and assume $\bar b_k > \bar b_k'$. Consider $\zeta_{kj}(\sigma^2)$ and $\zeta_{kj}'(\sigma^2)$ as the unique solutions to (58) for $\bar b_k$ and $\bar b_k'$, respectively. Then,

$$\text{(i)}\ \ \zeta_{kj}(\sigma^2) < \zeta_{kj}'(\sigma^2) \qquad \text{(ii)}\ \ \bar b_k\zeta_{kj}(\sigma^2) > \bar b_k'\zeta_{kj}'(\sigma^2). \tag{60}$$

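Proof: The proof is based on the consideration of an extended version of the random matrix model assumed in Theorem 9. Let us consider the following random matrices $\mathbf{H}_k^L \in \mathbb{C}^{LN\times LN_k}$, given as

$$\mathbf{H}_k^L = \frac{1}{\sqrt{LN}}\left[\left(\mathbf{R}_{k1}^L\right)^{\frac12}\mathbf{Z}_{k1}^L, \ldots, \left(\mathbf{R}_{kN_k}^L\right)^{\frac12}\mathbf{Z}_{kN_k}^L\right] \tag{61}$$

where $\mathbf{R}_{kj}^L = \operatorname{diag}(\mathbf{R}_{kj}, \ldots, \mathbf{R}_{kj}) \in \mathbb{C}^{LN\times LN}$ are block-diagonal matrices consisting of $L$ copies of the matrix $\mathbf{R}_{kj}$ and $\mathbf{Z}_{kj}^L \in \mathbb{C}^{LN\times L}$ are random matrices composed of i.i.d. entries with zero mean, unit variance and finite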
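moment of order $4 + \varepsilon$, for some $\varepsilon > 0$. We define the following matrices which will be of repeated use:

$$\mathbf{B}^L = \sum_{k=1}^{K}\bar b_k\mathbf{H}_k^L(\mathbf{H}_k^L)^{\sf H}, \qquad \mathbf{B}'^L = \bar b_k'\mathbf{H}_k^L(\mathbf{H}_k^L)^{\sf H} + \sum_{l=1, l\ne k}^{K}\bar b_l\mathbf{H}_l^L(\mathbf{H}_l^L)^{\sf H}$$

$$\mathbf{Q} = \left(\mathbf{B}^L + \sigma^2\mathbf{I}_{NL}\right)^{-1}, \qquad \mathbf{Q}' = \left(\mathbf{B}'^L + \sigma^2\mathbf{I}_{NL}\right)^{-1}.$$

One can verify from Theorem 9 that for any fixed $N, N_1, \ldots, N_K$, the following limit holds as $L \to \infty$:

$$\frac{1}{LN}\operatorname{tr}\mathbf{R}_{kj}^L\left(\mathbf{B}^L + \sigma^2\mathbf{I}_{NL}\right)^{-1} \xrightarrow{\text{a.s.}} \zeta_{kj}(\sigma^2).$$

Thus, any properties of the random quantities on the left-hand side of the previous equation also hold for the deterministic quantities $\zeta_{kj}(\sigma^2)$. We will exploit this fact for the termination of the proof. The matrices $\mathbf{B}^L$ and $\mathbf{B}'^L$ differ only by $\bar b_k$. This assumption will be sufficient for the proof since the case $\bar b_l > \bar b_l'$ for $l \in \{1, \ldots, K\}$ follows by simple iteration of the case $\bar b_l' = \bar b_l$ for $l \ne k$ and $\bar b_k > \bar b_k'$.

To prove (i), it is now sufficient to show that, for any $L$,

$$\frac{1}{LN}\operatorname{tr}\mathbf{R}_{kj}^L(\mathbf{Q} - \mathbf{Q}') < 0.$$

By Lemma 6, this is equivalent to proving $\mathbf{Q}^{-1} - (\mathbf{Q}')^{-1} \succ 0$, which is straightforward since

$$\mathbf{Q}^{-1} - (\mathbf{Q}')^{-1} = \mathbf{B}^L - \mathbf{B}'^L = (\bar b_k - \bar b_k')\mathbf{H}_k^L(\mathbf{H}_k^L)^{\sf H} \succ 0.$$

Thus,

$$\frac{1}{LN}\operatorname{tr}\mathbf{R}_{kj}^L(\mathbf{Q} - \mathbf{Q}') \xrightarrow{\text{a.s.}} \zeta_{kj}(\sigma^2) - \zeta_{kj}'(\sigma^2) < 0$$

since $\zeta_{kj}(\sigma^2)$ and $\zeta_{kj}'(\sigma^2)$ do not depend on $L$.

For (ii), we need to show that

$$\bar b_k\frac{1}{LN}\operatorname{tr}\mathbf{R}_{kj}^L\mathbf{Q} - \bar b_k'\frac{1}{LN}\operatorname{tr}\mathbf{R}_{kj}^L\mathbf{Q}' > 0.$$

Similarly to the previous part of the proof, it is sufficient to show that $(\bar b_k\mathbf{Q})^{-1} - (\bar b_k'\mathbf{Q}')^{-1} \prec 0$. Hence,

$$(\bar b_k\mathbf{Q})^{-1} - (\bar b_k'\mathbf{Q}')^{-1} = \left(\frac{1}{\bar b_k} - \frac{1}{\bar b_k'}\right)\left(\sum_{l=1, l\ne k}^{K}\bar b_l\mathbf{H}_l^L(\mathbf{H}_l^L)^{\sf H} + \sigma^2\mathbf{I}_{NL}\right) \prec 0$$

since $\sigma^2 > 0$, $\bar b_k > \bar b_k'$ and $\bar b_l \ge 0$ for all $l$. ■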

Consider now $(x_1, \ldots, x_K)$ and $(x_1', \ldots, x_K')$, such that $x_k > x_k'\ \forall k$, and denote by $(\bar b_1, \ldots, \bar b_K)$ and $(\bar b_1', \ldots, \bar b_K')$ the corresponding solutions to (57). Denote by $\zeta_{kj}(\sigma^2)$ and $\zeta_{kj}'(\sigma^2)$ the unique solutions to (58) for $(\bar b_1, \ldots, \bar b_K)$ and $(\bar b_1', \ldots, \bar b_K')$, respectively. It follows from (59) that $\bar b_k < \bar b_k'\ \forall k$. Equation (60) now implies that $\zeta_{kj}(\sigma^2) > \zeta_{kj}'(\sigma^2)$ and $\bar b_k\zeta_{kj}(\sigma^2) < \bar b_k'\zeta_{kj}'(\sigma^2)$. Combining these results yields

$$h_k(x_1, \ldots, x_K) = \frac{1}{N}\sum_{j=1}^{N_k}\frac{\zeta_{kj}(\sigma^2)}{1 + \bar b_k\zeta_{kj}(\sigma^2)} > \frac{1}{N}\sum_{j=1}^{N_k}\frac{\zeta_{kj}'(\sigma^2)}{1 + \bar b_k'\zeta_{kj}'(\sigma^2)} = h_k(x_1', \ldots, x_K')$$

which proves monotonicity.

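To prove scalability, let $\alpha > 1$, and consider the following difference:

$$\alpha h_k(x_1, \dots, x_K) - h_k(\alpha x_1, \dots, \alpha x_K) = \frac{1}{N}\sum_{j=1}^{N_k}\frac{\alpha\zeta_{kj}(\sigma^2) - \zeta^{(\alpha)}_{kj}(\sigma^2) + \zeta_{kj}(\sigma^2)\,\zeta^{(\alpha)}_{kj}(\sigma^2)\left(\alpha\bar b^{(\alpha)}_k - \bar b_k\right)}{\left[1 + \bar b_k\zeta_{kj}(\sigma^2)\right]\left[1 + \bar b^{(\alpha)}_k\zeta^{(\alpha)}_{kj}(\sigma^2)\right]}$$

where we have denoted by $\bar b^{(\alpha)}_k$ the solution to (57) with $x_k$ replaced by $\alpha x_k$ and by $\zeta^{(\alpha)}_{kj}(\sigma^2)$ the solution to (58) for $\bar b^{(\alpha)}_k$. We have from (59)-(i) that $\bar b^{(\alpha)}_k < \bar b_k$ and from (59)-(ii) that

$$\alpha x_k\bar b^{(\alpha)}_k > x_k\bar b_k \iff \alpha\bar b^{(\alpha)}_k - \bar b_k > 0. \tag{62}$$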

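It remains now to show that also $\alpha\zeta_{kj}(\sigma^2) - \zeta^{(\alpha)}_{kj}(\sigma^2) > 0$. To this end, consider the following difference:

$$\alpha\zeta_{kj}(\sigma^2) - \zeta^{(\alpha)}_{kj}(\sigma^2) = \frac{1}{N}\operatorname{tr}\mathbf{R}_{kj}\left(\alpha\mathbf{T}(\sigma^2) - \mathbf{T}^{(\alpha)}(\sigma^2)\right)$$

where

$$\mathbf{T}(\sigma^2) = \left(\frac{1}{N}\sum_{k=1}^{K}\sum_{j=1}^{N_k}\frac{\bar b_k\mathbf{R}_{kj}}{1 + \bar b_k\zeta_{kj}(\sigma^2)} + \sigma^2\mathbf{I}_N\right)^{-1}, \qquad \mathbf{T}^{(\alpha)}(\sigma^2) = \left(\frac{1}{N}\sum_{k=1}^{K}\sum_{j=1}^{N_k}\frac{\bar b^{(\alpha)}_k\mathbf{R}_{kj}}{1 + \bar b^{(\alpha)}_k\zeta^{(\alpha)}_{kj}(\sigma^2)} + \sigma^2\mathbf{I}_N\right)^{-1}.$$

By Lemma 6, it is now sufficient to show that $\left(\mathbf{T}^{(\alpha)}(\sigma^2)\right)^{-1} \succ \left(\alpha\mathbf{T}(\sigma^2)\right)^{-1}$. Write therefore

$$\left(\mathbf{T}^{(\alpha)}(\sigma^2)\right)^{-1} - \left(\alpha\mathbf{T}(\sigma^2)\right)^{-1} = \sigma^2\left(1 - \frac{1}{\alpha}\right)\mathbf{I}_N + \frac{1}{N}\sum_{k=1}^{K}\sum_{j=1}^{N_k}\frac{\alpha\bar b^{(\alpha)}_k - \bar b_k + \bar b^{(\alpha)}_k\bar b_k\left[\alpha\zeta_{kj}(\sigma^2) - \zeta^{(\alpha)}_{kj}(\sigma^2)\right]}{\alpha\left[1 + \bar b_k\zeta_{kj}(\sigma^2)\right]\left[1 + \bar b^{(\alpha)}_k\zeta^{(\alpha)}_{kj}(\sigma^2)\right]}\,\mathbf{R}_{kj}.$$

The first summand is positive definite since $\sigma^2 > 0$ and $\alpha > 1$. All other terms are also positive definite since $\alpha\bar b^{(\alpha)}_k - \bar b_k > 0$ from (62) and $\alpha\bar b^{(\alpha)}_k\bar b_k\zeta_{kj}(\sigma^2) > \bar b_k\bar b^{(\alpha)}_k\zeta^{(\alpha)}_{kj}(\sigma^2)$, since $\alpha\bar b^{(\alpha)}_k > \bar b_k$ and $\bar b_k\zeta_{kj}(\sigma^2) > \bar b^{(\alpha)}_k\zeta^{(\alpha)}_{kj}(\sigma^2)$ by (60)-(ii) and (59)-(i). Since the sum of positive definite matrices is also positive definite, we have $\alpha\zeta_{kj}(\sigma^2) - \zeta^{(\alpha)}_{kj}(\sigma^2) > 0$. This terminates the proof of scalability.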

Thus, we have shown $h : (x_1, \dots, x_K) \mapsto (h_1, \dots, h_K)$ to be a standard function. Moreover, from the fixed-point algorithms described in Theorem 1 and Theorem 9, and the fact that the $\zeta_{kj}$ are bounded (and therefore there exist $x_1, \dots, x_K$ such that $x_i > h_i(x_1, \dots, x_K)$ for each $i$), we have the following algorithm to compute $b_k$ and $\zeta_{kj}(\sigma^2)$:

At) 



lim b 



(*) 

k ' 
where



= ^trPfe^fcPfe + \c k - x k b^ ^ I„ fc ) 




-l 



and b k ' can take any value in [0, c k c k /x k ) and Q° (f 2 ) = 1/ct 2 for all 
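
The convergence argument above rests on $h$ being a standard function (positive, monotone, scalable), for which repeated application $x^{(t+1)} = h(x^{(t)})$ converges to the unique fixed point whenever one exists. The following is a minimal sketch of that iteration only. The affine map `h_toy`, the matrix `G` and the vector `ETA` are hypothetical stand-ins chosen because they are easy to verify; they are not the function $h$ of this appendix.

```python
def iterate_to_fixed_point(h, x0, tol=1e-12, max_iter=10_000):
    """Iterate x <- h(x) until successive iterates differ by less than tol."""
    x = list(x0)
    for _ in range(max_iter):
        x_new = h(x)
        if max(abs(a - b) for a, b in zip(x_new, x)) < tol:
            return x_new
        x = x_new
    return x

# Hypothetical toy example: h(x) = G x + eta with G >= 0 and eta > 0.
# Monotonicity follows from G >= 0, and scalability from
# alpha * h(x) - h(alpha * x) = (alpha - 1) * eta > 0 for alpha > 1.
G = [[0.1, 0.3], [0.2, 0.1]]
ETA = [1.0, 0.5]

def h_toy(x):
    return [sum(g * xi for g, xi in zip(row, x)) + e for row, e in zip(G, ETA)]

if __name__ == "__main__":
    print(iterate_to_fixed_point(h_toy, [0.0, 0.0]))
    print(iterate_to_fixed_point(h_toy, [10.0, 10.0]))
```

Both starting points lead to the same limit, which for this toy map is the solution of $(\mathbf{I} - \mathbf{G})x = \eta$.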



Appendix E 



Proof of Theorem 4 



We begin by proving the following result:

$$\max_k \left|a_k(\sigma^2) - b_k(\sigma^2)\right| \xrightarrow{\text{a.s.}} 0 \tag{63}$$

$$\max_k \left|\bar a_k(\sigma^2) - \bar b_k(\sigma^2)\right| \xrightarrow{\text{a.s.}} 0 \tag{64}$$

where $a_k(\sigma^2)$, $\bar a_k(\sigma^2)$ are defined in Theorem 1 and $b_k(\sigma^2)$, $\bar b_k(\sigma^2)$ are defined in Theorem 2, assuming that the matrices $\mathbf{H}_k$ are random and modeled as described in (2). For notational simplicity, we will drop from now on the dependence on $\sigma^2$. From standard lemmas of matrix analysis, we have



where the last step follows from Lemma 3. If $\bar a_i$ were not dependent on $\mathbf{h}_{kj}$, we could now simply proceed by applying Lemma 4 to the individual quadratic forms, i.e.:



where, in the following, for $\{a_N\}$ and $\{b_N\}$ two sequences of random variables, we denote by $a_N \asymp b_N$ the equivalence relation $a_N - b_N \xrightarrow{\text{a.s.}} 0$ for $N \to \infty$.

However, in order to show that this step is correct, in a similar manner as in the proof of Theorem 7, we need the following intermediate arguments. Define $a_{i,kj}$ and $\bar a_{i,kj}$ as the unique solutions to the following fixed-point equations:







&i,kj AT^^i (^ijkj^i l^k ^i.kj^i.kj^-rii]) 



-1 
for $i \in \{1, \dots, K\}$, where

$[\mathbf{h}_{k1} \cdots \mathbf{h}_{k,j-1}\ \mathbf{h}_{k,j+1} \cdots \mathbf{h}_{kN_k}], \quad k = i$

Thus, $a_{i,kj}$ and $\bar a_{i,kj}$ are independent of $\mathbf{h}_{kj}$. Following similar steps as in the proof of Theorem 7 (Step 3), one can show that for $i \in \{1, \dots, K\}$ and all $k, j$,

$$a_{i,kj} - a_i \xrightarrow{\text{a.s.}} 0, \qquad \bar a_{i,kj} - \bar a_i \xrightarrow{\text{a.s.}} 0. \tag{65}$$

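Thus, we have

$$\begin{aligned}
a_k &= \frac{1}{N}\sum_{j=1}^{N_k}\frac{\mathbf{h}_{kj}^{\mathsf H}\left(\sum_{i=1}^{K}\bar a_i\mathbf{H}_i\mathbf{H}_i^{\mathsf H} - \bar a_k\mathbf{h}_{kj}\mathbf{h}_{kj}^{\mathsf H} + \sigma^2\mathbf{I}_N\right)^{-1}\mathbf{h}_{kj}}{1 + \bar a_k\mathbf{h}_{kj}^{\mathsf H}\left(\sum_{i=1}^{K}\bar a_i\mathbf{H}_i\mathbf{H}_i^{\mathsf H} - \bar a_k\mathbf{h}_{kj}\mathbf{h}_{kj}^{\mathsf H} + \sigma^2\mathbf{I}_N\right)^{-1}\mathbf{h}_{kj}} \\
&\overset{(a)}{\asymp} \frac{1}{N}\sum_{j=1}^{N_k}\frac{\mathbf{h}_{kj}^{\mathsf H}\left(\sum_{i=1}^{K}\bar a_{i,kj}\mathbf{H}_i\mathbf{H}_i^{\mathsf H} - \bar a_{k,kj}\mathbf{h}_{kj}\mathbf{h}_{kj}^{\mathsf H} + \sigma^2\mathbf{I}_N\right)^{-1}\mathbf{h}_{kj}}{1 + \bar a_k\mathbf{h}_{kj}^{\mathsf H}\left(\sum_{i=1}^{K}\bar a_{i,kj}\mathbf{H}_i\mathbf{H}_i^{\mathsf H} - \bar a_{k,kj}\mathbf{h}_{kj}\mathbf{h}_{kj}^{\mathsf H} + \sigma^2\mathbf{I}_N\right)^{-1}\mathbf{h}_{kj}} \\
&\overset{(b)}{\asymp} \frac{1}{N}\sum_{j=1}^{N_k}\frac{\frac{1}{N}\operatorname{tr}\mathbf{R}_{kj}\left(\sum_{i=1}^{K}\bar a_{i,kj}\mathbf{H}_i\mathbf{H}_i^{\mathsf H} - \bar a_{k,kj}\mathbf{h}_{kj}\mathbf{h}_{kj}^{\mathsf H} + \sigma^2\mathbf{I}_N\right)^{-1}}{1 + \bar a_k\frac{1}{N}\operatorname{tr}\mathbf{R}_{kj}\left(\sum_{i=1}^{K}\bar a_{i,kj}\mathbf{H}_i\mathbf{H}_i^{\mathsf H} - \bar a_{k,kj}\mathbf{h}_{kj}\mathbf{h}_{kj}^{\mathsf H} + \sigma^2\mathbf{I}_N\right)^{-1}} \\
&\overset{(c)}{\asymp} \frac{1}{N}\sum_{j=1}^{N_k}\frac{\frac{1}{N}\operatorname{tr}\mathbf{R}_{kj}\left(\sum_{i=1}^{K}\bar a_i\mathbf{H}_i\mathbf{H}_i^{\mathsf H} + \sigma^2\mathbf{I}_N\right)^{-1}}{1 + \bar a_k\frac{1}{N}\operatorname{tr}\mathbf{R}_{kj}\left(\sum_{i=1}^{K}\bar a_i\mathbf{H}_i\mathbf{H}_i^{\mathsf H} + \sigma^2\mathbf{I}_N\right)^{-1}} \\
&\overset{(d)}{\asymp} \frac{1}{N}\sum_{j=1}^{N_k}\frac{\frac{1}{N}\operatorname{tr}\mathbf{R}_{kj}\mathbf{T}}{1 + \bar a_k\frac{1}{N}\operatorname{tr}\mathbf{R}_{kj}\mathbf{T}}
\end{aligned} \tag{66}$$

where (a) follows from (65), (b) follows from Lemma 4 and Lemma 8, (c) is again due to (65) and Lemma 7, and (d) follows from an application of Theorem 9, where we have defined

$$\mathbf{T} = \left(\frac{1}{N}\sum_{k=1}^{K}\sum_{j=1}^{N_k}\frac{\bar a_k\mathbf{R}_{kj}}{1 + \bar a_k\frac{1}{N}\operatorname{tr}\mathbf{R}_{kj}\mathbf{T}} + \sigma^2\mathbf{I}_N\right)^{-1}.$$

Note again that Theorem 9 cannot be directly applied here since the quantities $\bar a_i$ depend on the matrices $\mathbf{H}_i$. However, it is immediate to show that the result extends in this case, by replacing $\bar a_i$ by $\bar a_{i,kj}$ at each necessary step of the proof.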
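Hence, we can write

$$a_k = \frac{1}{N}\sum_{j=1}^{N_k}\frac{\frac{1}{N}\operatorname{tr}\mathbf{R}_{kj}\mathbf{T}}{1 + \bar a_k\frac{1}{N}\operatorname{tr}\mathbf{R}_{kj}\mathbf{T}} + \varepsilon_{N,k}$$

for some sequences of reals $\varepsilon_{N,k}$, satisfying $\varepsilon_{N,k} \to 0$.
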
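Recall now the following definitions for $k = 1, \dots, K$:

$$a_k = \frac{1}{N}\sum_{j=1}^{N_k}\frac{\frac{1}{N}\operatorname{tr}\mathbf{R}_{kj}\mathbf{T}}{1 + \bar a_k\frac{1}{N}\operatorname{tr}\mathbf{R}_{kj}\mathbf{T}} + \varepsilon_{N,k}, \qquad \bar a_k = \frac{1}{N}\sum_{j=1}^{n_k}\frac{p_{kj}}{\bar c_k - a_k\bar a_k + a_k p_{kj}}, \qquad 0 \leq \bar a_k < \frac{c_k\bar c_k}{a_k}$$

$$b_k = \frac{1}{N}\sum_{j=1}^{N_k}\frac{\frac{1}{N}\operatorname{tr}\mathbf{R}_{kj}\bar{\mathbf{T}}}{1 + \bar b_k\frac{1}{N}\operatorname{tr}\mathbf{R}_{kj}\bar{\mathbf{T}}}, \qquad \bar b_k = \frac{1}{N}\sum_{j=1}^{n_k}\frac{p_{kj}}{\bar c_k - b_k\bar b_k + b_k p_{kj}}, \qquad 0 \leq \bar b_k < \frac{c_k\bar c_k}{b_k}$$

where

$$\mathbf{T} = \left(\frac{1}{N}\sum_{k=1}^{K}\sum_{j=1}^{N_k}\frac{\bar a_k\mathbf{R}_{kj}}{1 + \bar a_k\frac{1}{N}\operatorname{tr}\mathbf{R}_{kj}\mathbf{T}} + \sigma^2\mathbf{I}_N\right)^{-1}, \qquad \bar{\mathbf{T}} = \left(\frac{1}{N}\sum_{k=1}^{K}\sum_{j=1}^{N_k}\frac{\bar b_k\mathbf{R}_{kj}}{1 + \bar b_k\frac{1}{N}\operatorname{tr}\mathbf{R}_{kj}\bar{\mathbf{T}}} + \sigma^2\mathbf{I}_N\right)^{-1}.$$

Denote $P = \max_k\{\limsup\|\mathbf{P}_k\|\}$, $R = \max_{k,j}\{\limsup\|\mathbf{R}_{kj}\|\}$, $c_+ = \max_k\{\limsup c_k\}$, $\bar c_- = \min_k\{\liminf \bar c_k\}$ and $\bar c_+ = \max_k\{\limsup \bar c_k\}$. Since we are interested in the asymptotic limit $N \to \infty$, we assume from the beginning that $N$ is sufficiently large, so that the following inequalities hold for all $k$:

$$c_k < c_+, \qquad \bar c_- < \bar c_k < \bar c_+, \qquad \|\mathbf{P}_k\| < P, \qquad \|\mathbf{R}_{kj}\| < R.$$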

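We then have the following properties:

$$\bar a_k < \frac{P}{(1-c_+)\bar c_-}, \qquad \bar b_k < \frac{P}{(1-c_+)\bar c_-}, \qquad b_k\bar b_k < c_+\bar c_+, \qquad a_k\bar a_k < c_+\bar c_+. \tag{67}$$

For notational simplicity, we define the following quantities:

$$\xi = \max_k |a_k - b_k|, \qquad \bar\xi = \max_k |\bar a_k - \bar b_k|.$$

We will show in the sequel that $\xi \xrightarrow{\text{a.s.}} 0$ and $\bar\xi \xrightarrow{\text{a.s.}} 0$ as $N \to \infty$.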
Consider first the following difference: 



sup 



trR fe , (T - T) 



sup 

k,j 



sup 



1 

TV" 



trRfejT 



1 + 5,1 



a/Rjm 



6fR m 



^^l + a|itrR| m T 1 + b^trR^T^ 



J_ a; - + a;6; (^trR m T - ^trR m T) 1 

N hihi (l + «/>R m T) (1 + bjitrR^f) JV 

1 



trRfcjTR; m T 



< — rJv maxcfc 



cr* 1 



max |afe — &fc| + max \a k b k \ sup 



JV 



trR fej (T - T 



4 



< — Kc+ 



d-c + )^r 



i 

JV" 



trRfe, (T - T) 






where the first equality follows from Lemma 2. Rearranging the terms yields: 



sup 

k,j 



N 



trR fej (T - T) 



< 



P 2 Ka 



+ 



Ft 2 P 2 
(l-c+) 2 c 2 _ 



ror a > (1 _ C+ ) E _ ■ 

Consider now the term £ = max/j |ofc — 6fe|: 



£ = max 

k 



N k 1 



< c+ sup 

kj 



E 

3 = 1 
1 



trRfej (T — T) + (b k — a^jftrUkjjj^^-kjT^ 



(1 + OfcitrRjyT) (1 + fefeitrR fej T) 



< 



TV 



trRfe, (T - T) 



+ c + — -max|a fe - b k \ + max |eAr jfc | 



(7 k 



7 C+i? 2 - 

fl 2 P 2 £ + + max|eiv,fe| 



(l-c+) 2 c 2 _ 

A 2 P 2 



+ 



c+R 2 



£ + max|e A r. fc | 

k 



(l-c+) 2 ci 

where the last inequality follows from (68). Similarly, we have for £ = maxj. ja^ — b k \: 



max 

k 



E 



Ofca fc - +pkj(bk - a,k) 



(c fe - a fe a fc + a k p k] )(c k - b k b k + b k p kj ) 



1 pL max fc K _ M maxfc [a k \a k -b k ]\+ max fc r& fe |a fe - 6 fe |l 
<— > -^-r- .. . VVkj 



(l-c + ) 2 c 2 



(1 - c+) 2 c 2 _ 



< 



1 + 



(l-c + ) 2 c 2 _ ^ ' (l-c+)c. 
Thus, for a 2 > max{( i ^, ( i4f) e _ }' we have 

2P 2 
(l-c+) 2 c 2 



PRc+ 



<j 2 (l -c+) 2 c 2 _ 



(1 - c+)c_ 



Replacing (70) in (69) leads to 



P 2 #c 2 



R 2 P 2 



(l-c+) 2 c 2 _ 



c+R 2 

<7 4 



2P 2 

(1 - c+) 2 c 2 _ 



1 

(1 - c+)c_ 



£ + max|e A r ife | 
fe 



For a 2 sufficiently large, we therefore have 



< i < Ce Ntk 







(68) 



(69) 



(70) 



for some $C > 0$. This implies that $\xi \to 0$ and, by (70), that $\bar\xi \to 0$. Since $a_k, b_k, \bar a_k, \bar b_k$ have analytic extensions in a neighborhood of $\mathbb{R}_-$ (see the proof of Theorem 7 for similar arguments) on which they are (almost surely) uniformly bounded, we have from Vitali's convergence theorem [40] that the almost sure convergence holds true for all $\sigma^2 \in \mathbb{R}_+$. This terminates the proof.
A. Convergence of the mutual information

Consider now the first term of Vn{& 2 ) in Theorem 10. Due to the convergence of — t>k and the almost 
sure boundedness of the HfeHjj matrices, it follows that \\J2k=i(®k ~ &fc)HfcH£ || —> 0, and we can immediately 
conclude, by convergence mapping arguments, that 



^ log dct (l N + ^ S fc H fc H^ - ^ log dot (l N + ^ H fe H fe^ 



Applying Corollary 3 to the second term yields 



0. 



^logdet ^Ijv + ^^ftfcHfcHj^ -V N (a 2 )^0. 



Consider now lffi(a 2 ) and (a 2 ) as defined in Theorems 3 and 4. It follows from (63), (64) and (71), that 



(71) 



4°V)-^ U V) 



(b) i 2\ a - s \ 



0. 



This implies also that 



4V)--TEV) 



■(ft), 2\ 



0. 



(72) 



To prove convergence in the mean, we can no longer use the fact that 1$ (<? 2 ) is bounded for all N as in Appendix B, 
which is now untrue. Instead, we will use the same arguments as in [5]. Denote 

K N k 

§]§i + M-*)Cw(-s) 



i 1 / 1 K Nk 

m%\z) = - tr( Bjv zI N )-\ m%\z) = - tr - ]T £ - 



-zl 



N 



where m$(z) is the Stieltjes transform of B^. It is easy to see that 



Em^(-w) 



1 -(b)f \ 

- -mV(-w) 



We now apply the argument from [5, pp. 923] which shows that 

1 -(b) , \ 



< 



/;( 


"1 






,00 j 




Ax* ^ 


( 








+ 









1 v^v^ bk(co)R.k,j 



dw 



the right-hand side of which exists for all N and is uniformly bounded by ^(KPR). Since m^(— u>)— m^ j (— w) — — ^ 
(as a consequence of the convergence — 6^ ^> 0), the boundedness of m^(— w) then ensures (by dominated 
convergence) that - m^(-w) ->• 0. Since the integrand tends to zero and is summable independently 

of N, the dominated convergence theorem now ensures that 

e4V)-4V)^o. 



Proof of Theorem 6: The proof follows directly from (63), (64), and Theorem 5. 






Proof of Corollary 2: The almost sure convergence follows directly from Theorem 4 and the continuous 
mapping theorem [42, Theorem 2.3]. For the convergence in mean, note first that, as a standard result of information 
theory, I ( p(cr 2 ) - i?^(cr 2 ) > for all N. Consider now the extended matrix model where H£ e C LNxLN " is 
defined in (61), P£ = P fe <8> I L e C Ln " xLnk and W£ e C LJVfcXL " fc is constructed from Ln k columns of a 
LNk x LNk random unitary matrix. Denote lff L (a 2 ) and B^ L {<r 2 ) the associated mutual information and MMSE 
sum-rate for this channel model. One can verify that for this model and by Theorem 4 and the convergence of 
R$(<r 2 ) - R { n\(t 2 ) in the almost sure sense, the following holds 

4> 2 )^4V) 

L—too 
L— >oo 

Thus, 

) - K N,L\ (T I - 1 N,L\ a ) - 1 N V a ) + J N V a ) ~ H N \ a ) ' N \ )~ H N.L{ a ) 



from which we can conclude that I^\a 2 ) — R$(a 2 ) > for all N. Using this result, it follows that 



JV 

Since 2 sup^y /j^ (<r 2 ) < oo and Evn — > 2sup 7V I^\a 2 ) by Theorem 4, it finally follows from [46, 

Problem 16.4 (a)] that 

Ei$V 2 )-££V)^0. 



Appendix F 
Fundamental lemmas 

Lemma 1 (Defining properties of Stieltjes transforms, Theorem 3.2 in [11]): If $m$ is a function analytic on $\mathbb{C}^+$ such that $m(z) \in \mathbb{C}^+$ if $z \in \mathbb{C}^+$ and

$$\lim_{y\to\infty} -\mathrm{i}y\, m(\mathrm{i}y) = 1 \tag{73}$$

then $m$ is the Stieltjes transform of a distribution function $F$ given by

$$F(b) - F(a) = \lim_{y\to 0}\frac{1}{\pi}\int_a^b \operatorname{Im}\left[m(x + \mathrm{i}y)\right]dx.$$

If, moreover, $zm(z) \in \mathbb{C}^+$ for $z \in \mathbb{C}^+$, then $F(0^-) = 0$, in which case $m$ has an analytic continuation on $\mathbb{C}\setminus\mathbb{R}_+$.
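
As a small numerical illustration of the inversion formula in Lemma 1, the sketch below uses a discrete measure with Stieltjes transform $m(z) = \sum_i w_i/(\lambda_i - z)$ and recovers the mass in an interval from $\frac{1}{\pi}\int_a^b \operatorname{Im}[m(x+\mathrm{i}y)]\,dx$ at a small fixed $y$. The atoms, weights, grid size and value of $y$ are assumptions chosen for illustration; the result is only approximate because $y$ is not taken to zero.

```python
import math

def stieltjes_discrete(z, atoms, weights):
    """m(z) = sum_i w_i / (lambda_i - z) for a discrete probability measure."""
    return sum(w / (lam - z) for lam, w in zip(atoms, weights))

def mass_from_inversion(a, b, atoms, weights, y=1e-3, num=200_001):
    """Approximate F(b) - F(a) by (1/pi) * integral_a^b Im m(x + iy) dx."""
    dx = (b - a) / (num - 1)
    values = [
        stieltjes_discrete(complex(a + i * dx, y), atoms, weights).imag
        for i in range(num)
    ]
    # Trapezoidal rule on the uniform grid
    integral = dx * (sum(values) - 0.5 * (values[0] + values[-1]))
    return integral / math.pi

# Hypothetical example measure: three atoms with total mass one
ATOMS = [0.5, 1.5, 3.0]
WEIGHTS = [0.2, 0.5, 0.3]

if __name__ == "__main__":
    # Only the atom at 1.5 (weight 0.5) lies in (1, 2)
    print(mass_from_inversion(1.0, 2.0, ATOMS, WEIGHTS))
    # Normalization (73): -iy m(iy) approaches 1 for large y
    print(-1j * 1e6 * stieltjes_discrete(1j * 1e6, ATOMS, WEIGHTS))
```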
Lemma 2 (Resolvent identity): For invertible matrices $\mathbf{A}$ and $\mathbf{B}$, we have the following identity:

$$\mathbf{A}^{-1} - \mathbf{B}^{-1} = \mathbf{A}^{-1}(\mathbf{B} - \mathbf{A})\mathbf{B}^{-1}.$$
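
The resolvent identity $\mathbf{A}^{-1} - \mathbf{B}^{-1} = \mathbf{A}^{-1}(\mathbf{B}-\mathbf{A})\mathbf{B}^{-1}$ can be checked numerically on small matrices. The sketch below does so for $2\times 2$ matrices with hand-written helpers; the matrices `A` and `B` and all helper names are illustrative assumptions.

```python
def inv2(M):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def mul2(X, Y):
    """Product of two 2x2 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def sub2(X, Y):
    """Entrywise difference X - Y."""
    return [[X[i][j] - Y[i][j] for j in range(2)] for i in range(2)]

def resolvent_gap(A, B):
    """Largest entrywise gap between A^{-1} - B^{-1} and A^{-1}(B - A)B^{-1}."""
    lhs = sub2(inv2(A), inv2(B))
    rhs = mul2(mul2(inv2(A), sub2(B, A)), inv2(B))
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))

# Hypothetical invertible matrices (determinants 5.5 and 6)
A = [[2.0, 1.0], [0.5, 3.0]]
B = [[4.0, -1.0], [2.0, 1.0]]

if __name__ == "__main__":
    print(resolvent_gap(A, B))  # zero up to floating-point rounding
```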
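Lemma 3 (A matrix inversion lemma, Equation (2.2) in [43]): Let $\mathbf{A} \in \mathbb{C}^{N\times N}$ be Hermitian invertible, then for any vector $\mathbf{x} \in \mathbb{C}^N$ and any scalar $\tau \in \mathbb{C}$ such that $\mathbf{A} + \tau\mathbf{x}\mathbf{x}^{\mathsf H}$ is invertible

$$\mathbf{x}^{\mathsf H}\left(\mathbf{A} + \tau\mathbf{x}\mathbf{x}^{\mathsf H}\right)^{-1} = \frac{\mathbf{x}^{\mathsf H}\mathbf{A}^{-1}}{1 + \tau\mathbf{x}^{\mathsf H}\mathbf{A}^{-1}\mathbf{x}}.$$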
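
A quick numerical check of the matrix inversion lemma, $\mathbf{x}^{\mathsf H}(\mathbf{A} + \tau\mathbf{x}\mathbf{x}^{\mathsf H})^{-1} = \mathbf{x}^{\mathsf H}\mathbf{A}^{-1}/(1 + \tau\mathbf{x}^{\mathsf H}\mathbf{A}^{-1}\mathbf{x})$, is sketched below for a $2\times 2$ complex Hermitian matrix. The particular `A`, `X` and `TAU` are assumptions made for the example.

```python
def inv2(M):
    """Inverse of a 2x2 (possibly complex) matrix given as nested lists."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def row_times(v, M):
    """Row vector v times 2x2 matrix M."""
    return [v[0] * M[0][j] + v[1] * M[1][j] for j in range(2)]

def inversion_lemma_gap(A, x, tau):
    """Largest entrywise gap between both sides of the lemma."""
    xh = [c.conjugate() for c in x]  # entries of x^H
    perturbed = [[A[i][j] + tau * x[i] * xh[j] for j in range(2)]
                 for i in range(2)]
    lhs = row_times(xh, inv2(perturbed))
    xh_a_inv = row_times(xh, inv2(A))
    denom = 1 + tau * (xh_a_inv[0] * x[0] + xh_a_inv[1] * x[1])
    rhs = [c / denom for c in xh_a_inv]
    return max(abs(l - r) for l, r in zip(lhs, rhs))

# Hypothetical Hermitian positive definite A (determinant 5), vector x, scalar tau
A = [[2 + 0j, 1j], [-1j, 3 + 0j]]
X = [1 + 1j, 2 - 1j]
TAU = 0.5

if __name__ == "__main__":
    print(inversion_lemma_gap(A, X, TAU))  # zero up to floating-point rounding
```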



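Lemma 4 (Trace lemma [34, Lemma 2.7]): Let $\mathbf{A}_1, \mathbf{A}_2, \dots$, with $\mathbf{A}_N \in \mathbb{C}^{N\times N}$, be a sequence of matrices with uniformly bounded spectral norm and let $\mathbf{x}_N \in \mathbb{C}^N$ be random vectors of i.i.d. entries with zero mean, variance $1/N$ and eighth order moment of order $O(1/N^4)$, independent of $\mathbf{A}_N$. Then, as $N \to \infty$,

$$\mathbf{x}_N^{\mathsf H}\mathbf{A}_N\mathbf{x}_N - \frac{1}{N}\operatorname{tr}\mathbf{A}_N \xrightarrow{\text{a.s.}} 0. \tag{74}$$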
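
The trace lemma can be illustrated by simulation. The sketch below is restricted, as a simplifying assumption, to a diagonal $\mathbf{A}_N$ with entries spread over $[0.5, 2]$ and to complex Gaussian entries of variance $1/N$, so that the quadratic form reduces to $\sum_i a_i|x_i|^2$. It shows one random draw for a given $N$ and is not a proof of almost sure convergence.

```python
import random

def quadratic_form_gap(N, seed=0):
    """|x^H A x - (1/N) tr A| for diagonal A and one Gaussian draw of x."""
    rng = random.Random(seed)
    # Diagonal of A: bounded spectral norm, independent of x
    diag = [0.5 + 1.5 * i / (N - 1) for i in range(N)]
    # Real and imaginary parts each have variance 1/(2N), so E|x_i|^2 = 1/N
    s = (2 * N) ** -0.5
    quad = 0.0
    for a in diag:
        re, im = rng.gauss(0.0, s), rng.gauss(0.0, s)
        quad += a * (re * re + im * im)
    return abs(quad - sum(diag) / N)

if __name__ == "__main__":
    for N in (100, 1_000, 10_000):
        print(N, quadratic_form_gap(N))
```

The printed gaps are expected to shrink roughly like $1/\sqrt{N}$ for this setup, although individual draws fluctuate.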

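Lemma 5 (Trace lemma for isometric matrices, [8]): Let $\mathbf{W}$ be $n < N$ columns of an $N \times N$ Haar matrix and suppose $\mathbf{w}$ is a column of $\mathbf{W}$. Let $\mathbf{B}_N$ be an $N \times N$ random matrix, which is a function of all columns of $\mathbf{W}$ except $\mathbf{w}$ and $B = \sup_N \|\mathbf{B}_N\| < \infty$, then

$$\mathbb{E}\left[\left|\mathbf{w}^{\mathsf H}\mathbf{B}_N\mathbf{w} - \frac{1}{N-n}\operatorname{tr}\left\{\boldsymbol{\Pi}\mathbf{B}_N\right\}\right|^4\right] \leq \frac{C}{N^2},$$

where $\boldsymbol{\Pi} = \mathbf{I}_N - \mathbf{W}\mathbf{W}^{\mathsf H} + \mathbf{w}\mathbf{w}^{\mathsf H}$ and $C$ is a constant which depends only on $B$ and $\frac{n}{N}$.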

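Lemma 6 (Trace inequality): Let $\mathbf{A}, \mathbf{B}, \mathbf{R} \in \mathbb{C}^{N\times N}$, where $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{R}$ are nonnegative-definite, satisfying $\mathbf{B} \succ \mathbf{A}$. Then

$$\operatorname{tr}\mathbf{R}\left(\mathbf{A}^{-1} - \mathbf{B}^{-1}\right) \geq 0. \tag{75}$$

Proof: Note that $\mathbf{B} \succ \mathbf{A}$ implies by [44, Corollary 7.7.4] $\mathbf{B}^{-1} \prec \mathbf{A}^{-1}$. Thus, for any vector $\mathbf{x} \in \mathbb{C}^N$,

$$\mathbf{x}^{\mathsf H}\left(\mathbf{A}^{-1} - \mathbf{B}^{-1}\right)\mathbf{x} \geq 0. \tag{76}$$

Consider now the eigenvalue decomposition of the matrix $\mathbf{R} = \mathbf{U}\boldsymbol{\Lambda}\mathbf{U}^{\mathsf H}$, where $\mathbf{U} = [\mathbf{u}_1, \dots, \mathbf{u}_N]$ and $\boldsymbol{\Lambda} = \operatorname{diag}(\lambda_1, \dots, \lambda_N)$. Since $\lambda_i \geq 0\ \forall i$, we have

$$\operatorname{tr}\mathbf{R}\left(\mathbf{A}^{-1} - \mathbf{B}^{-1}\right) = \sum_{i=1}^{N}\lambda_i\mathbf{u}_i^{\mathsf H}\left(\mathbf{A}^{-1} - \mathbf{B}^{-1}\right)\mathbf{u}_i \geq 0. \tag{77}$$

■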
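
The trace inequality of Lemma 6 can be sanity-checked on a small example. The sketch below evaluates $\operatorname{tr}\mathbf{R}(\mathbf{A}^{-1} - \mathbf{B}^{-1})$ for $2\times 2$ real symmetric matrices with $\mathbf{B} - \mathbf{A}$ positive definite and a rank-one $\mathbf{R}$. The specific matrices are assumptions chosen so that the arithmetic is easy to follow by hand.

```python
def inv2(M):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def trace_of_product(X, Y):
    """tr(X Y) for 2x2 matrices."""
    return sum(X[i][k] * Y[k][i] for i in range(2) for k in range(2))

def trace_inequality_value(A, B, R):
    """Value of tr R (A^{-1} - B^{-1})."""
    a_inv, b_inv = inv2(A), inv2(B)
    diff = [[a_inv[i][j] - b_inv[i][j] for j in range(2)] for i in range(2)]
    return trace_of_product(R, diff)

# Hypothetical matrices: A positive definite, B = A + (positive definite), R of rank one
A = [[2.0, 0.5], [0.5, 1.0]]
B = [[3.0, 0.7], [0.7, 1.5]]
R = [[1.0, 1.0], [1.0, 1.0]]

if __name__ == "__main__":
    print(trace_inequality_value(A, B, R))  # nonnegative, as the lemma states
```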

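Lemma 7 (Rank-1 perturbation lemma [43]): Let $z < 0$, $\mathbf{A} \in \mathbb{C}^{N\times N}$, $\mathbf{B} \in \mathbb{C}^{N\times N}$ with $\mathbf{B}$ Hermitian nonnegative definite, and $\mathbf{v} \in \mathbb{C}^N$. Then,

$$\left|\operatorname{tr}\left(\left(\mathbf{B} - z\mathbf{I}_N\right)^{-1} - \left(\mathbf{B} + \mathbf{v}\mathbf{v}^{\mathsf H} - z\mathbf{I}_N\right)^{-1}\right)\mathbf{A}\right| \leq \frac{\|\mathbf{A}\|}{|z|}.$$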

Lemma 8 ([15, Lemma 1]): Denote $a_N$, $\bar a_N$, $b_N$ and $\bar b_N$ four infinite sequences of complex random variables indexed by $N$ and assume $a_N \asymp \bar a_N$ and $b_N \asymp \bar b_N$. If $|a_N|$, $|\bar b_N|$ and/or $|\bar a_N|$, $|b_N|$ are uniformly bounded above over $N$ (almost surely), then $a_N b_N \asymp \bar a_N\bar b_N$. Similarly, if $|a_N|$, $|\bar b_N|^{-1}$ and/or $|\bar a_N|$, $|b_N|^{-1}$ are uniformly bounded above over $N$ (almost surely), then $a_N/b_N \asymp \bar a_N/\bar b_N$.
Lemma 9 (Tonelli theorem [39, Theorem 18.3]): If $(\Omega, \mathcal{F}, P)$ and $(\Omega', \mathcal{F}', P')$ are two probability spaces, then for $f$ an integrable function with respect to the product measure $Q$ on $\mathcal{F} \times \mathcal{F}'$,

$$\int_{\Omega\times\Omega'} f(x,y)\,Q(d(x,y)) = \int_{\Omega}\left[\int_{\Omega'} f(x,y)\,P'(dy)\right]P(dx)$$

and

$$\int_{\Omega\times\Omega'} f(x,y)\,Q(d(x,y)) = \int_{\Omega'}\left[\int_{\Omega} f(x,y)\,P(dx)\right]P'(dy).$$


Appendix G 
Related results 

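Theorem 9 ([1, Theorem 1]): Let $\mathbf{B}_N = \mathbf{X}\mathbf{X}^{\mathsf H}$, where $\mathbf{X} \in \mathbb{C}^{N\times n}$ is random. The $j$th column $\mathbf{x}_j$ of $\mathbf{X}$ is given as $\mathbf{x}_j = \mathbf{R}_j^{1/2}\mathbf{z}_j$, where the entries of $\mathbf{z}_j \in \mathbb{C}^N$ are i.i.d. with zero mean, variance $1/N$ and finite moment of order $4+\epsilon$, for some common $\epsilon > 0$, and $\mathbf{R}_j \in \mathbb{C}^{N\times N}$ are Hermitian nonnegative definite matrices. Let $\mathbf{D}_N \in \mathbb{C}^{N\times N}$ be a deterministic Hermitian matrix. Assume that both $\mathbf{R}_j$ and $\mathbf{D}_N$ have uniformly bounded spectral norms (with respect to $N$). Then, as $n, N \to \infty$ such that $0 < \liminf N/n \leq \limsup N/n < \infty$, the following holds for any $z \in \mathbb{C}\setminus\mathbb{R}_+$:

$$\frac{1}{N}\operatorname{tr}\mathbf{D}_N\left(\mathbf{B}_N - z\mathbf{I}_N\right)^{-1} - \frac{1}{N}\operatorname{tr}\mathbf{D}_N\mathbf{T}_N(z) \xrightarrow{\text{a.s.}} 0$$

where $\mathbf{T}_N(z) \in \mathbb{C}^{N\times N}$ is defined as

$$\mathbf{T}_N(z) = \left(\frac{1}{N}\sum_{j=1}^{n}\frac{\mathbf{R}_j}{1 + \delta_j(z)} - z\mathbf{I}_N\right)^{-1}$$

and where $\delta_1(z), \dots, \delta_n(z)$ are given as the unique solution to the following set of implicit equations:

$$\delta_j(z) = \frac{1}{N}\operatorname{tr}\mathbf{R}_j\left(\frac{1}{N}\sum_{k=1}^{n}\frac{\mathbf{R}_k}{1 + \delta_k(z)} - z\mathbf{I}_N\right)^{-1}, \quad j = 1, \dots, n \tag{78}$$

such that $(\delta_1(z), \dots, \delta_n(z)) \in \mathcal{S}^n$. For $z < 0$, $\delta_1(z), \dots, \delta_n(z)$ are the unique nonnegative solutions to (78) and can be obtained by a standard fixed-point algorithm with initial values $\delta_j^{(0)}(z) = -1/z$ for $j = 1, \dots, n$. Moreover, let $F_N$ be the empirical spectral distribution (e.s.d.) of $\mathbf{B}_N$ and denote by $\bar F_N$ the distribution function with Stieltjes transform $\frac{1}{N}\operatorname{tr}\mathbf{T}_N(z)$. Then, almost surely,

$$F_N - \bar F_N \Rightarrow 0.$$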

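Theorem 10 ([45]): Under the assumptions of Theorem 9, let $\sigma^2 > 0$ and define $V_N(\sigma^2) = \frac{1}{N}\log\det\left(\mathbf{I}_N + \frac{1}{\sigma^2}\mathbf{B}_N\right)$. Then, as $N, n \to \infty$,

$$\mathbb{E}V_N(\sigma^2) - \bar V_N(\sigma^2) \to 0$$

where

$$\bar V_N(\sigma^2) = \frac{1}{N}\log\det\left(\mathbf{I}_N + \frac{1}{\sigma^2}\frac{1}{N}\sum_{j=1}^{n}\frac{\mathbf{R}_j}{1 + \delta_j}\right) + \frac{1}{N}\sum_{j=1}^{n}\log(1 + \delta_j) - \frac{1}{N}\sum_{j=1}^{n}\frac{\delta_j}{1 + \delta_j}$$

and where $\delta_j = \delta_j(-\sigma^2)$ for $j = 1, \dots, n$ are given by Theorem 9.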

Corollary 3: Under the assumptions of Theorem 10, assume additionally that the matrices $\mathbf{R}_j$, $j = 1, \dots, n$, are drawn from a finite set of Hermitian nonnegative-definite matrices. Then, as $N, n \to \infty$,

$$V_N(\sigma^2) - \bar V_N(\sigma^2) \xrightarrow{\text{a.s.}} 0 \tag{79}$$

where $V_N(\sigma^2)$ and $\bar V_N(\sigma^2)$ are defined as in Theorem 10.

Proof: It was shown in [2, Proof of Theorem 3] that $\mathbf{B}_N$ has almost surely uniformly bounded spectral norm as $N, n \to \infty$ if the matrices $\mathbf{R}_j$ are drawn from a finite set of matrices. Thus, $F_N$ and $\bar F_N$ as defined in Theorem 9 have (almost surely) bounded support. Consider now a set $A \subset \Omega$, $\Omega$ generating the matrices $\mathbf{B}_N$, for which $\mathbf{B}_N$ has bounded spectral norm, and a set $B \subset \Omega$ for which $F_N - \bar F_N \Rightarrow 0$. Since $P(A) = P(B) = P(A \cap B) = 1$, it follows from [46, Theorem 25.8 (ii)], that, as $N, n \to \infty$

$$\int\log\left(1 + x^{-1}\lambda\right)dF_N(\lambda) - \int\log\left(1 + x^{-1}\lambda\right)d\bar F_N(\lambda) \xrightarrow{\text{a.s.}} 0 \tag{80}$$

which is equivalent to stating that $V_N(x) - \bar V_N(x) \xrightarrow{\text{a.s.}} 0$. ■